Theses / dissertations on the topic "Apprentissage de la similarité"
Create an accurate reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 theses / dissertations for your research on the topic "Apprentissage de la similarité".
Next to every source in the list of references, there is an "Add to bibliography" button. Click on it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a .pdf and read its abstract online whenever it is available in the metadata.
Browse theses / dissertations from a wide variety of scientific disciplines and compile a correct bibliography.
Risser-Maroix, Olivier. "Similarité visuelle et apprentissage de représentations". Electronic Thesis or Diss., Université Paris Cité, 2022. http://www.theses.fr/2022UNIP7327.
The objective of this CIFRE thesis is to develop an image search engine, based on computer vision, to assist customs officers. Indeed, we observe, paradoxically, an increase in security threats (terrorism, trafficking, etc.) coupled with a decrease in the number of customs officers. The images of cargoes acquired by X-ray scanners already allow the inspection of a load without requiring it to be opened and fully searched. By automatically proposing similar images, such a search engine would help the customs officer in his decision making when faced with infrequent or suspicious visual signatures of products. Thanks to the development of modern artificial intelligence (AI) techniques, our era is undergoing great changes: AI is transforming all sectors of the economy. Some see this advent of "robotization" as the dehumanization of the workforce, or even its replacement. However, reducing the use of AI to the simple search for productivity gains would be reductive. In reality, AI could increase the work capacity of humans rather than compete with them in order to replace them. It is in this context, the birth of Augmented Intelligence, that this thesis takes place. This manuscript, devoted to the question of visual similarity, is divided into two parts, presenting two practical cases where the collaboration between humans and AI is beneficial. In the first part, the problem of learning representations for the retrieval of similar images is investigated. After implementing a first system similar to those proposed by the state of the art, one of its main limitations is pointed out: the semantic bias. Indeed, the main contemporary methods use image datasets coupled only with semantic labels, and the literature considers that two images are similar if they share the same label. This vision of the notion of similarity, however fundamental in AI, is reductive. It is therefore questioned in the light of work in cognitive psychology in order to propose an improvement: taking visual similarity into account. This new definition allows a better synergy between the customs officer and the machine. This work is the subject of scientific publications and a patent. In the second part, after identifying the key components that improve the performance of the previously proposed system, an approach mixing empirical and theoretical research is proposed. This second case, augmented intelligence, is inspired by recent developments in mathematics and physics. First applied to the understanding of an important hyperparameter (temperature), then to a larger task (classification), the proposed method provides an intuition on the importance and role of factors correlated with the studied variable (e.g. hyperparameter, score, etc.). The processing chain thus set up has demonstrated its efficiency by providing a highly explainable solution in line with decades of research in machine learning. These findings will allow the improvement of previously developed solutions.
Grimal, Clément. "Apprentissage de co-similarités pour la classification automatique de données monovues et multivues". Thesis, Grenoble, 2012. http://www.theses.fr/2012GRENM092/document.
Machine learning consists in conceiving computer programs capable of learning from their environment, or from data. Different kinds of learning exist, depending on what the program is learning and in which context it learns, which naturally leads to different tasks. Similarity measures play a predominant role in most of these tasks, which is why this thesis focuses on their study. More specifically, we focus on data clustering, a so-called unsupervised learning task, in which the goal of the program is to organize a set of objects into several clusters, in such a way that similar objects are grouped together. In many applications, these objects (documents for instance) are described by their links to other types of objects (words for instance), which can be clustered as well. This case is referred to as co-clustering, and in this thesis we study and improve the co-similarity algorithm XSim. We demonstrate that these improvements enable the algorithm to outperform state-of-the-art methods. Additionally, these objects are frequently linked to more than one other type of object; data describing such multiple relations between several types of objects are called multiview. Classical methods are generally not able to consider and use all the information contained in these data. For this reason, we present in this thesis a new multiview similarity algorithm called MVSim, which can be considered as a multiview extension of the XSim algorithm. We demonstrate that this method outperforms state-of-the-art multiview methods, as well as classical approaches, thus validating the interest of the multiview aspect. Finally, we also describe how to use the MVSim algorithm to cluster large-scale single-view data by first splitting it into multiple subsets. We demonstrate that this approach significantly reduces the running time and the memory footprint of the method, while only slightly lowering the quality of the clustering compared to a straightforward approach without splitting.
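The iterative co-similarity idea can be sketched in a few lines of numpy. This is a hedged illustration in the spirit of χ-Sim-style co-similarity algorithms, not the exact XSim or MVSim implementation from the thesis; the matrix names and the cosine-style normalization are illustrative assumptions.

```python
import numpy as np

def co_similarity(A, n_iter=4):
    """Iteratively compute document and word similarity matrices from a
    document-word matrix A: documents are similar if they share similar
    words, and words are similar if they occur in similar documents."""
    n_docs, n_words = A.shape
    SC = np.eye(n_words)  # word-word similarities, initialized to identity
    for _ in range(n_iter):
        SR = A @ SC @ A.T  # document-document similarities
        d = np.sqrt(np.diag(SR))
        SR /= np.outer(d, d) + 1e-12
        SC = A.T @ SR @ A
        d = np.sqrt(np.diag(SC))
        SC /= np.outer(d, d) + 1e-12
    return SR, SC

A = np.random.rand(6, 10)  # toy document-word count matrix
SR, SC = co_similarity(A)
```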
Boutin, Luc. "Biomimétisme, génération de trajectoires pour la robotique humanoïde à partir de mouvements humains". Poitiers, 2009. http://theses.edel.univ-poitiers.fr/theses/2009/Boutin-Luc/2009-Boutin-Luc-These.pdf.
The faithful reproduction of human locomotion is a topical issue for humanoid robots. The goal of this work is to define a process to imitate human motion with humanoid robots. In the first part, motion capture techniques are presented. The measurement protocol adopted is described, together with the computation of joint angles. An adaptation of three existing algorithms is proposed to detect contact events during complex movements. The method is validated by measurements on thirty healthy subjects. The second part deals with the generation of humanoid trajectories imitating human motion. Once the problem and the imitation process are defined, the balance criterion for walking robots is presented. Using data from human motion capture, the reference trajectories of the feet and the ZMP are defined. These paths are modified to avoid collisions between the feet, particularly when executing a slalom. Finally, an inverse kinematics algorithm developed for this problem is used to determine the joint angles associated with the robot's reference trajectories of the feet and ZMP. Several applications on the HOAP-3 and HRP-2 robots are presented. The trajectories are validated with respect to robot balance through dynamic simulations of the computed motion, while respecting actuator limits.
Grimal, Clement. "Apprentissage de co-similarités pour la classification automatique de données monovues et multivues". Phd thesis, Université de Grenoble, 2012. http://tel.archives-ouvertes.fr/tel-00819840.
Boyer, Laurent. "Apprentissage probabiliste de similarités d'édition". Phd thesis, Université Jean Monnet - Saint-Etienne, 2011. http://tel.archives-ouvertes.fr/tel-00718835.
Texto completo da fonteVogel, Robin. "Similarity ranking for biometrics : theory and practice". Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT031.
The rapid growth in population, combined with the increased mobility of people, has created a need for sophisticated identity management systems. For this purpose, biometrics refers to the identification of individuals using behavioral or biological characteristics. The most popular approaches, i.e. fingerprint, iris or face recognition, are all based on computer vision methods. The adoption of deep convolutional networks, enabled by general-purpose computing on graphics processing units, made the recent advances in computer vision possible. These advances have led to drastic improvements for conventional biometric methods, which boosted their adoption in practical settings and stirred up public debate about these technologies. In this respect, biometric system providers face many challenges when learning those networks. In this thesis, we consider those challenges from the angle of statistical learning theory, which leads us to propose or sketch practical solutions. First, we respond to the proliferation of papers on similarity learning for deep neural networks that optimize objective functions disconnected from the natural ranking aim sought in biometrics. Precisely, we introduce the notion of similarity ranking, by highlighting the relationship between bipartite ranking and the requirements for similarities that are well suited to biometric identification. We then extend the theory of bipartite ranking to this new problem, adapting it to the specificities of pairwise learning, particularly those regarding its computational cost. Usual objective functions optimize for predictive performance, but recent work has underlined the necessity to consider other aspects when training a biometric system, such as dataset bias, prediction robustness or notions of fairness. The thesis tackles all three of these examples by proposing their careful statistical analysis, as well as practical methods that give biometric system manufacturers the tools necessary to address those issues without jeopardizing the performance of their algorithms.
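To make the ranking objective concrete, here is a minimal numpy sketch of the empirical ranking risk on similarity scores, i.e. the fraction of mis-ranked (genuine, impostor) pairs — one minus the pairwise AUC. The score distributions are simulated and purely illustrative.

```python
import numpy as np

def empirical_ranking_risk(scores_genuine, scores_impostor):
    """Fraction of (genuine pair, impostor pair) couples that are mis-ranked,
    i.e. where a genuine pair's similarity does not exceed an impostor's."""
    diff = scores_genuine[:, None] - scores_impostor[None, :]
    return np.mean(diff <= 0)

rng = np.random.default_rng(0)
s_genuine = rng.normal(1.0, 0.5, 200)   # similarity scores of same-identity pairs
s_impostor = rng.normal(0.0, 0.5, 500)  # similarity scores of different-identity pairs
print(empirical_ranking_risk(s_genuine, s_impostor))
```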
Philippeau, Jérémy. "Apprentissage de similarités pour l'aide à l'organisation de contenus audiovisuels". Toulouse 3, 2009. http://thesesups.ups-tlse.fr/564/.
With a view to new usages in accessing audiovisual archives, we have created a semi-automatic system that helps a user organize audiovisual contents while performing tasks of classification, characterization, identification and ranking. To do so, we propose a new vocabulary, different from the one already available in INA documentary notices, to answer needs that cannot easily be defined with words. We have designed a graphical interface based on a graph formalism to express an organizational task. Numerical similarity is a good tool with respect to the elements handled, which are informational objects shown on the computer screen and automatically extracted low-level audio and video features. We chose to estimate the similarity between those elements with a predictive process based on a statistical model. Among the numerous existing models, statistical prediction based on univariate regression and support vectors was chosen.
Qamar, Ali Mustafa. "Mesures de similarité et cosinus généralisé : une approche d'apprentissage supervisé fondée sur les k plus proches voisins". Phd thesis, Université de Grenoble, 2010. http://tel.archives-ouvertes.fr/tel-00591988.
Texto completo da fonteAseervatham, Sujeevan. "Apprentissage à base de Noyaux Sémantiques pour le Traitement de Données Textuelles". Phd thesis, Université Paris-Nord - Paris XIII, 2007. http://tel.archives-ouvertes.fr/tel-00274627.
Texto completo da fonteDans le cadre de cette thèse, nous nous intéressons principalement à deux axes.
Le premier axe porte sur l'étude des problématiques liées au traitement de données textuelles structurées par des approches à base de noyaux. Nous présentons, dans ce contexte, un noyau sémantique pour les documents structurés en sections notamment sous le format XML. Le noyau tire ses informations sémantiques à partir d'une source de connaissances externe, à savoir un thésaurus. Notre noyau a été testé sur un corpus de documents médicaux avec le thésaurus médical UMLS. Il a été classé, lors d'un challenge international de catégorisation de documents médicaux, parmi les 10 méthodes les plus performantes sur 44.
Le second axe porte sur l'étude des concepts latents extraits par des méthodes statistiques telles que l'analyse sémantique latente (LSA). Nous présentons, dans une première partie, des noyaux exploitant des concepts linguistiques provenant d'une source externe et des concepts statistiques issus de la LSA. Nous montrons qu'un noyau intégrant les deux types de concepts permet d'améliorer les performances. Puis, dans un deuxième temps, nous présentons un noyau utilisant des LSA locaux afin d'extraire des concepts latents permettant d'obtenir une représentation plus fine des documents.
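The statistical-concept extraction mentioned here (LSA) is easy to illustrate; the following sketch builds latent concepts from a tf-idf matrix with scikit-learn's TruncatedSVD. The toy corpus is invented for illustration and is unrelated to the thesis's medical data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "kernel methods for text categorization",
    "semantic kernels use a thesaurus",
    "latent semantic analysis extracts concepts",
    "statistical concepts improve categorization",
]
X = TfidfVectorizer().fit_transform(docs)       # term-document tf-idf matrix
lsa = TruncatedSVD(n_components=2, random_state=0)
Z = lsa.fit_transform(X)                        # documents in the latent concept space
print(Z.shape)
```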
Qamar, Ali Mustafa. "Mesures de similarité et cosinus généralisé : une approche d'apprentissage supervisé fondée sur les k plus proches voisins". Phd thesis, Grenoble, 2010. http://www.theses.fr/2010GRENM083.
Almost all machine learning problems depend heavily on the metric used. Many works have shown that it is a far better approach to learn the metric structure from the data rather than assuming a simple geometry based on the identity matrix. This has paved the way for a new research theme called metric learning. Most works in this domain have based their approaches on distance learning only. However, other works have shown that similarity should be preferred over distance metrics when dealing with textual as well as non-textual datasets. Being able to efficiently learn appropriate similarity measures, as opposed to distances, is thus of high importance for various collections. While several works have partially addressed this problem for different applications, no previous work is known to have fully addressed it in the context of learning similarity metrics for kNN classification. This is exactly the focus of the current study. In the case of information filtering systems, where the aim is to filter an incoming stream of documents into a set of predefined topics with little supervision, cosine-based category-specific thresholds can be learned. Learning such thresholds can be seen as a first step towards learning a complete similarity measure. This strategy was used to develop online and batch algorithms for information filtering during the INFILE (Information Filtering) track of the CLEF (Cross Language Evaluation Forum) campaign in 2008 and 2009. However, provided enough supervised information is available, as is the case in classification settings, it is usually beneficial to learn a complete metric rather than thresholds. To this end, we developed numerous algorithms for learning complete similarity metrics for kNN classification. An unconstrained similarity learning algorithm called SiLA is developed, in which the normalization is independent of the similarity matrix. SiLA encompasses, among others, the standard cosine measure, as well as the Dice and Jaccard coefficients. SiLA is an extension of the voted perceptron algorithm and allows learning different types of similarity functions (based on diagonal, symmetric or asymmetric matrices). We then compare SiLA with RELIEF, a well-known feature re-weighting algorithm. It has recently been suggested by Sun and Wu that RELIEF can be seen as a distance metric learning algorithm optimizing a cost function which is an approximation of the 0-1 loss. We show here that this approximation is loose, and propose a stricter version closer to the 0-1 loss, leading to a new, and better, RELIEF-based algorithm for classification. We then focus on a direct extension of the cosine similarity measure, defined as a normalized scalar product in a projected space. The associated algorithm is called the generalized Cosine simiLarity Algorithm (gCosLA). All of the algorithms are tested on many different datasets. A statistical test, the s-test, is employed to assess whether the results are significantly different. gCosLA performed statistically much better than SiLA on many of the datasets. Furthermore, SiLA and gCosLA were compared with many state-of-the-art algorithms, illustrating their well-foundedness.
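The family of similarities SiLA works with is easy to write down; below is a hedged numpy sketch of a bilinear similarity whose normalization does not depend on the learned matrix, reducing to the standard cosine for A = I. The diagonal matrix here is a random stand-in for a learned one, not the output of the actual SiLA training procedure.

```python
import numpy as np

def sila_similarity(x, y, A):
    """Bilinear similarity x^T A y with a normalization that does not depend
    on A; with A = I this is exactly the standard cosine measure."""
    return (x @ A @ y) / (np.linalg.norm(x) * np.linalg.norm(y) + 1e-12)

rng = np.random.default_rng(1)
A = np.diag(rng.uniform(0.5, 2.0, 5))  # illustrative learned diagonal weighting
x, y = rng.normal(size=5), rng.normal(size=5)
print(sila_similarity(x, y, A))          # learned similarity
print(sila_similarity(x, y, np.eye(5)))  # plain cosine baseline
```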
Dhouib, Sofiane. "Contributions to unsupervised domain adaptation : Similarity functions, optimal transport and theoretical guarantees". Thesis, Lyon, 2020. http://www.theses.fr/2020LYSEI117.
The surge in the quantity of data produced nowadays has made Machine Learning, a subfield of Artificial Intelligence, a vital tool for extracting valuable patterns, and has allowed it to be integrated into almost every aspect of our everyday activities. Concretely, a machine learning algorithm learns such patterns after being trained on a dataset called the training set, and its performance is assessed on a different set called the testing set. Domain Adaptation is an active research area of machine learning in which, as opposed to supervised learning, the training and testing sets are not assumed to stem from the same probability distribution. In this case, the two distributions generating the training and testing data correspond respectively to the source and target domains. Our contributions focus on three theoretical aspects related to domain adaptation for classification tasks. The first one is learning with similarity functions, which deals with classification algorithms based on comparing an instance to other examples in order to decide its class. The second is large-margin classification, which concerns learning classifiers that maximize the separation between classes. The third is Optimal Transport, which formalizes the principle of least effort for transporting probability mass between two distributions. At the beginning of the thesis, we were interested in learning with so-called (ε, γ, τ)-good similarity functions in the domain adaptation framework, since these functions had been introduced in the literature in the classical framework of supervised learning. This is the subject of our first contribution, in which we theoretically study the performance of a similarity function on a target distribution, given that it is suitable for the source one. Then, we tackle the more general topic of large-margin classification in domain adaptation, with weaker assumptions than those adopted in the first contribution. In this context, we propose a new theoretical study and a domain adaptation algorithm, which is our second contribution. We derive novel bounds taking the classification margin on the target domain into account, which we convexify by leveraging the appealing Optimal Transport theory, in order to derive a domain adaptation algorithm with an adversarial variation of the classic Kantorovich problem. Finally, after noticing that our adversarial formulation can be generalized to include several other cases of interest, we dedicate our last contribution to adversarial or minimax variations of the optimal transport problem, where we demonstrate the versatility of our approach.
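For readers unfamiliar with the Kantorovich problem mentioned above, the following numpy sketch solves its entropy-regularized form with Sinkhorn iterations between two toy histograms. It illustrates the generic problem only, not the adversarial variant developed in the thesis; the regularization strength and data are illustrative assumptions.

```python
import numpy as np

def sinkhorn(a, b, M, reg=0.1, n_iter=200):
    """Entropy-regularized optimal transport between histograms a and b with
    cost matrix M, solved by alternating Sinkhorn scaling iterations."""
    K = np.exp(-M / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]  # transport plan with marginals a and b

a = np.array([0.5, 0.5])
b = np.array([0.25, 0.75])
M = np.array([[0.0, 1.0], [1.0, 0.0]])  # cost of moving mass between bins
P = sinkhorn(a, b, M)
print(P, P.sum())  # plan sums to 1; row/column sums approach a and b
```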
Gresse, Adrien. "L'Art de la Voix : Caractériser l'information vocale dans un choix artistique". Thesis, Avignon, 2020. http://www.theses.fr/2020AVIG0236.
To reach an international audience, audiovisual productions (films, TV shows, video games) must be translated into other languages. Generally, the original voice is replaced by a new voice in the target language. This process is referred to as dubbing. The voice casting process, aimed at choosing a voice (an actor) in accordance with the original voice and the character, is performed manually by an artistic director (AD). Today, ADs are looking for new "talents" (less expensive and more available than experienced dubbers), but they cannot perform large-scale auditions. Automatic tools capable of measuring the adequacy between a voice in a source language and a voice in a target language/culture, in a given context, are of great interest for audiovisual companies. In addition, beyond voice casting, this voice selection problem echoes the major scientific questions of voice similarity and perception mechanisms. In this work, we use the voices of professional actors selected by ADs in different languages, taken from already dubbed works. First, we set up a protocol with state-of-the-art methods in automatic speaker recognition to highlight the existence of character/role-specific information in our data. We also identify the influence of linguistic bias on the performance of the system. Then, we build a methodological framework to evaluate the ability of an automatic system to discriminate pairs of voices playing the same character. The system we created is based on Siamese Neural Networks. In this evaluation protocol, we apply strong constraints to avoid possible biases (linguistic content, gender, etc.) and we learn a similarity measure that reflects the AD's choices with a significant difference that cannot be attributed to chance. Finally, we train a new representation space highlighting character-specific information, called the p-vector. Thanks to our methodological framework, we show that this representation better discriminates the voices of new characters than a representation oriented towards speaker information. In addition, we show that it is possible to benefit from the generalized knowledge of a model learned on a similar dataset using knowledge distillation in neural networks. This thesis gives an initial answer for assisted voice casting and provides automatic tools capable of preselecting relevant voices from a large set of voices in a target language. Although the information characteristic of an artistic choice can be extracted from a large volume of data, even when this choice is difficult to formalize, we still have to highlight the explanatory factors of the system's decision. We would like to explain, in addition to the selection of voices, the reasons for this choice. Furthermore, understanding the decision process of the system would help us define the "voice palette". In future work, we would like to explore the influence of the target language and culture by extending our work to more languages. In the longer term, this work could help to understand how voice perception has evolved since the beginning of dubbing.
Brezellec, Pierre. "Techniques d'apprentissage par explication et détections de similarités". Paris 13, 1992. http://www.theses.fr/1992PA132033.
Texto completo da fonteMichaud, Dorian. "Indexation bio-inspirée pour la recherche d'images par similarité". Thesis, Poitiers, 2018. http://www.theses.fr/2018POIT2288/document.
Image retrieval is still a very active field of image processing, as the number of available image datasets continuously increases. One of the principal objectives of Content-Based Image Retrieval (CBIR) is to return the images most similar to a given query with respect to their visual content. Our work fits in a very specific application context: indexing small expert image datasets, with no prior knowledge of the images. Because of the image complexity, one of our contributions is the choice of effective descriptors from the literature, placed in direct competition. Two strategies are used to combine features: a psycho-visual one and a statistical one. In this context, we propose an unsupervised and adaptive framework based on the well-known bags of visual words and phrases models, which selects relevant visual descriptors for each keypoint to construct a more discriminative image representation. Experiments show the interest of this type of methodology at a time when convolutional neural networks are ubiquitous. We also propose a study of semi-interactive retrieval to improve the accuracy of CBIR systems by using the knowledge of expert users.
Champesme, Marc. "Apprentissage par détection de similarités utilisant le formalisme des graphes conceptuels". Paris 13, 1993. http://www.theses.fr/1993PA132004.
Texto completo da fonteBenhabiles, Halim. "3D-mesh segmentation : automatic evaluation and a new learning-based method". Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2011. http://tel.archives-ouvertes.fr/tel-00834344.
Ngo, Duy Hoa. "Enhancing Ontology Matching by Using Machine Learning, Graph Matching and Information Retrieval Techniques". Thesis, Montpellier 2, 2012. http://www.theses.fr/2012MON20096/document.
In recent years, ontologies have attracted a lot of attention in the Computer Science community, especially in the Semantic Web field. They serve as explicit conceptual knowledge models and provide the semantic vocabularies that make domain knowledge available for exchange and interpretation among information systems. However, due to the decentralized nature of the semantic web, ontologies are highly heterogeneous. This heterogeneity mainly causes the problem of variation in meaning or ambiguity in entity interpretation and, consequently, it prevents domain knowledge sharing. Therefore, ontology matching, which discovers correspondences between semantically related entities of ontologies, becomes a crucial task in semantic web applications. Several challenges to the field of ontology matching have been outlined in recent research. Among them, the selection of appropriate similarity measures and the configuration tuning of their combination are known as fundamental issues the community should deal with. In addition, verifying the semantic coherence of the discovered alignment is also known to be a crucial task. Furthermore, the difficulty of the problem grows with the size of the ontologies. To deal with these challenges, in this thesis we propose a novel matching approach that combines different techniques coming from the fields of machine learning, graph matching and information retrieval in order to enhance ontology matching quality. Indeed, we make use of information retrieval techniques to design new effective similarity measures for comparing labels and context profiles of entities at the element level. We also apply a graph matching method named similarity propagation at the structure level, which effectively discovers mappings by exploring structural information of entities in the input ontologies. To combine similarity measures at the element level, we transform the ontology matching task into a classification task in machine learning. Besides, we propose a dynamic weighted sum method to automatically combine the matching results obtained from the element- and structure-level matchers. In order to remove inconsistent mappings, we design a new fast semantic filtering method. Finally, to deal with large-scale ontology matching tasks, we propose two candidate selection methods to reduce the computational space. All these contributions have been implemented in a prototype named YAM++. To evaluate our approach, we adopt various tracks, namely Benchmark, Conference, Multifarm, Anatomy, Library and Large Biomedical Ontologies, from the OAEI campaign. The experimental results show that the proposed matching methods work effectively. Moreover, in comparison to other participants in OAEI campaigns, YAM++ proved to be highly competitive and achieved a high ranking position.
Alliod, Charlotte. "Conception et modélisation de nouvelles molécules hautement énergétiques en fonction des contraintes réglementaires et environnementales". Thesis, Lyon, 2018. http://www.theses.fr/2018LYSE1035.
For the last two decades, military research has focused on improving explosive performance while taking into account environmental and toxicological impacts. These issues are governed by strict regulations, such as REACh (Registration, Evaluation, Authorization and Restriction of Chemicals), to ensure a high level of health and environmental protection. Today, it is a major consideration to develop High Energetic Materials (HEM), i.e. molecules whose hazards to human health and the environment are reduced. Thus, in collaboration with Airbus Safran Launchers (ASL), a research program was set up to obtain optimized tools for predicting the potential toxicity of HEM and to design new non-toxic, regulation-compliant molecules. Different in silico methods have been used, including Quantitative Structure-Activity Relationships (QSAR) and machine learning. The search for structural similarity among molecules is an innovative tool on which we based our in silico predictions. This similarity is obtained thanks to an intelligent algorithm developed within the Pôle Rhône-Alpin de Bio-Informatique of Lyon, which gave rise to a patent. This algorithm allows us to obtain more accurate predictions based on experimental data from European directives.
Zhou, Zhyiong. "Recherche d'images par le contenu application à la proposition de mots clés". Thesis, Poitiers, 2018. http://www.theses.fr/2018POIT2254.
The search for information in masses of multimedia data and the content-based indexing of these large databases are very current problems. They are part of a type of data management called Digital Asset Management (DAM); DAM uses image segmentation and data classification techniques. Our main contributions in this thesis can be summarized in three points: an analysis of the possible uses of different local feature extraction methods using the VLAD technique; a new method for extracting dominant color information in an image; and a comparison of Support Vector Machines (SVM) with other classifiers for the proposed keyword suggestion. These contributions have been tested and validated on synthetic and real data. Our methods were then widely used in the DAM ePhoto system developed by the company EINDEN, which financed the CIFRE thesis in which this work was carried out. The results are encouraging and open new perspectives for research.
Kessler, Rémy. "Traitement automatique d'informations appliqué aux ressources humaines". Phd thesis, Université d'Avignon, 2009. http://tel.archives-ouvertes.fr/tel-00453642.
Texto completo da fonteElgui, Kevin. "Contributions to RSSI-based geolocation". Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT047.
Network-based geolocation has attracted a great deal of attention in the context of the Internet of Things. In many situations, low-consumption connected objects must be geolocated without the use of GPS or GSM. Geolocation techniques based on the Received Signal Strength Indicator (RSSI) stand out, because other location techniques may fail in the context of urban environments and/or narrow-band signals. First, we propose some methods for the RSSI-based geolocation problem. The observation is a vector of RSSI values received at the various base stations. In particular, we introduce a semi-parametric Nadaraya-Watson estimator of the likelihood, followed by a maximum a posteriori estimator of the object's position. Experiments demonstrate the interest of the proposed method, both in terms of location estimation performance and in the ability to build radio maps. An alternative approach is given by a k-nearest-neighbors regressor which uses a suitable metric between RSSI vectors. Results also show that the quality of the prediction is highly related to the chosen metric. Therefore, we turn our attention to the metric learning problem. We introduce an original task-driven objective for learning a similarity between pairs of data points. The similarity is chosen as a sum of regression trees and is sequentially learned by means of a modified version of the eXtreme Gradient Boosting algorithm (XGBoost). The last part of the thesis is devoted to the introduction of a Conditional Independence (CI) hypothesis test. The motivation is related to the fact that, for many estimators, the components of the RSSI vectors are assumed independent given the position. The contribution is however provided in a general statistical framework. We introduce the weighted partial copula function for testing conditional independence. The proposed test procedure results from the following ingredients: (i) the test statistic is an explicit Cramér-von Mises transformation of the weighted partial copula; (ii) the regions of rejection are computed using a bootstrap procedure which mimics conditional independence by generating samples. Under the null hypothesis, the weak convergence of the weighted partial copula process is established and endorses the soundness of our approach.
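The Nadaraya-Watson estimator mentioned above is a locally weighted average; here is a minimal numpy sketch of the generic regressor. The thesis's semi-parametric likelihood version is more involved, and the toy RSSI-like data below are invented.

```python
import numpy as np

def nadaraya_watson(X_train, y_train, x, bandwidth=1.0):
    """Nadaraya-Watson kernel regressor: a weighted average of training
    targets, with Gaussian weights decaying with distance to the query x."""
    d2 = np.sum((X_train - x) ** 2, axis=1)
    w = np.exp(-d2 / (2 * bandwidth ** 2))
    return w @ y_train / (w.sum() + 1e-12)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 2))              # e.g. RSSI pairs from two stations
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)   # toy target (e.g. a coordinate)
print(nadaraya_watson(X, y, np.array([0.5, -1.0]), bandwidth=0.5))
```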
Berrahou, Soumia Lilia. "Extraction d'arguments de relations n-aires dans les textes guidée par une RTO de domaine". Thesis, Montpellier, 2015. http://www.theses.fr/2015MONTS019/document.
Today, a huge amount of data is made available to the research community through several web-based libraries. Enhancing data collected from scientific documents is a major challenge in order to analyze and reuse domain knowledge efficiently. To be enhanced, data need to be extracted from documents and structured in a common representation using a controlled vocabulary, as in ontologies. Our research deals with knowledge engineering issues for experimental data extracted from scientific articles, in order to reuse them in decision support systems. Experimental data can be represented by n-ary relations which link a studied object (e.g. food packaging, transformation process) with its features (e.g. oxygen permeability in packaging, biomass grinding) and capitalized in an Ontological and Terminological Resource (OTR). An OTR associates an ontology with a terminological and/or a linguistic part in order to establish a clear distinction between the term and the notion it denotes (the concept). Our work focuses on n-ary relation extraction from scientific documents in order to populate a domain OTR with new instances. Our contributions are based on Natural Language Processing (NLP) combined with data mining approaches guided by the domain OTR. More precisely, we first focus on the extraction of units of measure, which are known to be difficult to identify because of their typographic variations. We propose to rely on automatic text classification, using supervised learning methods, to reduce the search space of unit variants, and then we propose a new similarity measure that identifies them, taking their syntactic properties into account. Second, we propose to adapt and combine data mining methods (sequential pattern and rule mining) with syntactic analysis in order to overcome the challenging process of identifying and extracting n-ary relation instances drowned in unstructured texts.
Akgül, Ceyhun Burak. "Descripteurs de forme basés sur la densité probabiliste et apprentissage des similarités pour la recherche d'objets 3D". Phd thesis, Télécom ParisTech, 2007. http://pastel.archives-ouvertes.fr/pastel-00003154.
Texto completo da fonteTrouvilliez, Benoît. "Similarités de données textuelles pour l'apprentissage de textes courts d'opinions et la recherche de produits". Thesis, Artois, 2013. http://www.theses.fr/2013ARTO0403/document.
This Ph.D. thesis is about establishing textual data similarities in the customer relations domain. Two subjects are mainly considered: the automatic analysis of short messages in response to satisfaction surveys, and the search for products based on criteria expressed in natural language by a human through a conversation with a program. The first subject concerns the statistical information drawn from survey answers. The ideas recognized in the answers are identified, organized according to a taxonomy and quantified. The second subject concerns the transcription of criteria about products into queries to be interpreted by a database management system. The criteria under consideration range widely, from the simplest, like material or brand, to the most complex, like color or price. The two subjects meet on the problem of establishing textual data similarities using NLP techniques. The main difficulties come from the fact that the texts to be processed, written in natural language, are short and contain many spelling errors and negations. Establishing semantic similarities between words (synonymy, antonymy, ...) and syntactic relations between syntagms (conjunction, opposition, ...) are other issues considered in our work. In this Ph.D. thesis we also study automatic clustering and classification methods in order to analyse answers to satisfaction surveys.
Akgül, Ceyhun Burak. "Descripteurs de forme basés sur la densité de probabilité et apprentissage des similarités pour la recherche d'objets 3D". Paris, ENST, 2007. http://www.theses.fr/2007ENST0026.
Content-based retrieval research aims at developing search engines that allow users to perform a query by similarity of content. This thesis deals with two fundamental problems in content-based 3D object retrieval: (1) How to describe a 3D shape to obtain a reliable representative for the subsequent task of similarity search? (2) How to supervise the search process to learn inter-shape similarities for more effective and semantic retrieval? Concerning the first problem, we develop a novel 3D shape description scheme based on the probability density of multivariate local surface features. We constructively obtain local characterizations of 3D points and then summarize the resulting local shape information into a global shape descriptor. For probability density estimation, we use the general-purpose kernel density estimation methodology, coupled with a fast approximation algorithm: the fast Gauss transform. Experiments that we have conducted on several 3D object databases show that density-based descriptors are very fast to compute and very effective for 3D similarity search. Concerning the second problem, we propose a similarity learning scheme. Our approach relies on combining multiple similarity scores by optimizing a convex regularized version of the empirical ranking risk criterion. This score-fusion approach to similarity learning is applicable to a variety of search engine problems. In this work, we demonstrate its effectiveness in 3D object retrieval.
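The density-based description idea can be illustrated with a one-dimensional toy version: local feature values are summarized into a fixed-length descriptor by a Gaussian kernel density estimate evaluated on a grid. This is a hedged simplification; the thesis works with multivariate features and accelerates the computation with the fast Gauss transform.

```python
import numpy as np

def density_descriptor(features, grid, bandwidth=0.2):
    """Summarize a set of scalar local feature values into a global descriptor
    by evaluating their Gaussian kernel density estimate on a fixed grid."""
    diff = grid[:, None] - features[None, :]
    kde = np.exp(-0.5 * (diff / bandwidth) ** 2).mean(axis=1)
    return kde / (kde.sum() + 1e-12)  # normalized descriptor vector

rng = np.random.default_rng(2)
curvatures = rng.normal(0.3, 0.1, 500)  # toy local surface features of one shape
descriptor = density_descriptor(curvatures, np.linspace(-1, 1, 64))
print(descriptor.shape)  # fixed-length representative for similarity search
```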
Schutz, Georges. "Adaptations et applications de modèles mixtes de réseaux de neurones à un processus industriel". Phd thesis, Université Henri Poincaré - Nancy I, 2006. http://tel.archives-ouvertes.fr/tel-00115770.
Texto completo da fonteartificiels pour améliorer le contrôle de processus industriels
complexes, caractérisés en particulier par leur aspect temporel.
Les motivations principales pour traiter des séries temporelles
sont la réduction du volume de données, l'indexation pour la
recherche de similarités, la localisation de séquences,
l'extraction de connaissances (data mining) ou encore la
prédiction.
Le processus industriel choisi est un four à arc
électrique pour la production d'acier liquide au Luxembourg. Notre
approche est un concept de contrôle prédictif et se base sur des
méthodes d'apprentissage non-supervisé dans le but d'une
extraction de connaissances.
Notre méthode de codage se base sur
des formes primitives qui composent les signaux. Ces formes,
composant un alphabet de codage, sont extraites par une méthode
non-supervisée, les cartes auto-organisatrices de Kohonen (SOM).
Une méthode de validation des alphabets de codage accompagne
l'approche.
Un sujet important abordé durant ces recherches est
la similarité de séries temporelles. La méthode proposée est
non-supervisée et intègre la capacité de traiter des séquences de
tailles variées.
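Since the coding alphabet above comes from Kohonen self-organizing maps, a compact numpy sketch of SOM training on toy signal windows may help; the grid size, decay schedules and data are illustrative assumptions, not the thesis's settings.

```python
import numpy as np

def train_som(data, grid=(6, 6), n_iter=2000, lr0=0.5, sigma0=2.0):
    """Tiny Kohonen self-organizing map: each step pulls the best matching
    unit and its grid neighborhood toward a randomly drawn sample."""
    rng = np.random.default_rng(0)
    w = rng.normal(size=(grid[0], grid[1], data.shape[1]))
    gy, gx = np.mgrid[0:grid[0], 0:grid[1]]
    for t in range(n_iter):
        x = data[rng.integers(len(data))]
        d = np.sum((w - x) ** 2, axis=2)
        by, bx = np.unravel_index(np.argmin(d), d.shape)  # best matching unit
        frac = t / n_iter
        lr = lr0 * (1 - frac)                             # decaying learning rate
        sigma = sigma0 * (1 - frac) + 0.5                 # shrinking neighborhood
        h = np.exp(-((gy - by) ** 2 + (gx - bx) ** 2) / (2 * sigma ** 2))
        w += lr * h[:, :, None] * (x - w)
    return w

segments = np.random.default_rng(1).normal(size=(500, 16))  # toy signal windows
codebook = train_som(segments)  # primitive shapes forming the coding alphabet
```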
Zheng, Lilei. "Triangular similarity metric learning : A siamese architecture approach". Thesis, Lyon, 2016. http://www.theses.fr/2016LYSEI045/document.
In many machine learning and pattern recognition tasks, there is always a need for appropriate metric functions to measure pairwise distance or similarity between data, where a metric function is a function that defines a distance or similarity between each pair of elements of a set. In this thesis, we propose Triangular Similarity Metric Learning (TSML) for automatically specifying a metric from data. A TSML system is embedded in a siamese architecture which consists of two identical sub-systems sharing the same set of parameters. Each sub-system processes a single data sample, so the whole system receives a pair of data as input. The TSML system includes a cost function parameterizing the pairwise relationship between data and a mapping function allowing the system to learn high-level features from the training data. In terms of the cost function, we first propose the Triangular Similarity, a novel similarity metric which is equivalent to the well-known Cosine Similarity in measuring a data pair. Based on a simplified version of the Triangular Similarity, we further develop the triangular loss function in order to perform metric learning, i.e. to increase the similarity between two vectors in the same class and to decrease the similarity between two vectors of different classes. Compared with other distance or similarity metrics, the triangular loss and its gradient naturally offer us an intuitive and interesting geometrical interpretation of the metric learning objective. In terms of the mapping function, we introduce three different options: a linear mapping realized by a simple transformation matrix, a nonlinear mapping realized by Multi-Layer Perceptrons (MLP) and a deep nonlinear mapping realized by Convolutional Neural Networks (CNN). With these mapping functions, we present three different TSML systems for various applications, namely pairwise verification, object identification, dimensionality reduction and data visualization. For each application, we carry out extensive experiments on popular benchmarks and datasets to demonstrate the effectiveness of the proposed systems.
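One possible reading of the simplified triangular loss is sketched below in numpy, assuming the form ||a|| + ||b|| − ||a + s·b|| with s = +1 for similar pairs and s = −1 for dissimilar ones; this is our hedged reconstruction of the geometric idea via the triangle inequality, not the thesis's exact formulation.

```python
import numpy as np

def triangular_loss(a, b, similar):
    """Triangular-style loss (assumed simplified form): with c = a + b for a
    similar pair and c = a - b for a dissimilar one, the triangle inequality
    makes ||a|| + ||b|| - ||c|| small exactly when the pair is well placed."""
    c = a + b if similar else a - b
    return np.linalg.norm(a) + np.linalg.norm(b) - np.linalg.norm(c)

a, b = np.array([1.0, 0.0]), np.array([0.8, 0.6])
print(triangular_loss(a, b, similar=True))   # small: aligned vectors, similar pair
print(triangular_loss(a, b, similar=False))  # large: aligned vectors penalized as dissimilar
```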
Kessler, Rémy. "Traitement automatique d’informations appliqué aux ressources humaines". Thesis, Avignon, 2009. http://www.theses.fr/2009AVIG0167/document.
Since the 1990s, the Internet has been at the heart of the labor market. First mobilized for specific expertise, its use has spread as the number of Internet users in the population has grown. Searching for a job through electronic job boards has become commonplace, and e-recruitment is now standard practice. This information explosion poses various processing problems, as the large amount of information is difficult for companies to manage quickly and effectively. We present in this PhD thesis the work we have developed within the E-Gen project, which aims to create tools to automate the flow of information during a recruitment process. We were first interested in the problems posed by the routing of emails. The ability of a company to manage this information flow efficiently and at lower cost has become a major issue for customer satisfaction. We propose the application of learning methods to perform automatic classification of emails for routing, combining probabilistic techniques and support vector machines. Next, we present work conducted on the analysis and integration of job advertisements via the Internet. We present a solution capable of integrating a job advertisement automatically or in an assisted manner in order to publish it quickly. Based on a combination of classifier systems driven by a Markov automaton, the system obtains very good results. Thereafter, we present several strategies based on vector-space and probabilistic models to solve the problem of profiling candidates according to a specific job offer, in order to assist recruiters. We evaluated a range of similarity measures to rank candidacies using ROC curves. A relevance feedback approach allowed us to surpass our previous results on this difficult, diverse and highly subjective task.
Chebbi, Mohamed Ali. "Similarity learning for large scale dense image matching". Electronic Thesis or Diss., Université Gustave Eiffel, 2024. http://www.theses.fr/2024UEFL2030.
Dense image matching is a long-standing, ill-posed problem. Despite the extensive research efforts undertaken in the last twenty years, state-of-the-art handcrafted algorithms perform poorly on featureless areas, in the presence of occlusions and shadows, and on non-Lambertian surfaces. This is due to the lack of distinctiveness of handcrafted similarity metrics in such challenging scenarios. On the other hand, deep learning based approaches to image matching are able to learn highly non-linear similarity functions and thus provide an interesting path to addressing such complex matching scenarios. In this research, we present deep learning based architectures and methods for stereo and multi-view dense image matching tailored to aerial and satellite photogrammetry. The proposed approach is driven by two key ideas. First, our goal is to develop a matching network that is as generic as possible with respect to different sensors and acquisition scenarios. Second, we argue that known geometrical relationships between images can alleviate the learning phase and should be leveraged in the process. As a result, our matching pipeline follows the well-known two-step pipeline, where we first compute deep similarities between pixel correspondences, followed by depth regularization. This separation ensures "generality" or "transferability" to different scenes and acquisitions. Furthermore, our similarity functions are learnt on epipolar-rectified image pairs, and to exploit the learnt embeddings in a general n-view matching problem, geometric priors are mobilized. In other words, we transform embeddings learnt on pairs of images into multi-view embeddings through a priori knowledge of the relative camera poses. This allows us to capitalize on the vast stereo matching benchmarks existing in the literature while extending the approach to multi-view scenarios. Finally, we tackle the insufficient distinctiveness of state-of-the-art patch-based features/similarities by feeding the network with large images, thus adding more context, and by proposing an adapted sample mining scheme. We establish a middle ground between state-of-the-art similarity learning and end-to-end regression models for stereo matching, and demonstrate that our models yield generalizable representations for multiple-view 3D surface reconstruction from aerial and satellite acquisitions. The proposed pipelines are implemented in MicMac, a free, open-source photogrammetric software package.
Morvant, Emilie. "Apprentissage de vote de majorité pour la classification supervisée et l'adaptation de domaine : approches PAC-Bayésiennes et combinaison de similarités". Phd thesis, Aix-Marseille Université, 2013. http://tel.archives-ouvertes.fr/tel-00879072.
Texto completo da fonteMorbieu, Stanislas. "Leveraging textual embeddings for unsupervised learning". Electronic Thesis or Diss., Université Paris Cité, 2020. http://www.theses.fr/2020UNIP5191.
Textual data is ubiquitous and is a useful information pool for many companies. In particular, the web provides an almost inexhaustible source of textual data that can be used for recommendation systems, business or technological watch, information retrieval, etc. Recent advances in natural language processing have made it possible to capture the meaning of words in their context in order to improve automatic translation systems, text summarization, or even the classification of documents according to predefined categories. However, the majority of these applications often rely on significant human intervention to annotate corpora: this annotation consists, for example in the context of supervised classification, in providing algorithms with examples of assigning categories to documents. The algorithm therefore learns to reproduce human judgment in order to apply it to new documents. The object of this thesis is to take advantage of these latest advances, which capture the semantics of text, and use them in an unsupervised framework. The contributions of this thesis revolve around three main axes. First, we propose a method to transfer the information captured by a neural network to the co-clustering of documents and words. Co-clustering consists in partitioning the two dimensions of a data matrix simultaneously, thus forming both groups of similar documents and groups of coherent words. This facilitates the interpretation of a large corpus of documents, since it is possible to characterize groups of documents by groups of words, thereby summarizing a large corpus of text. More precisely, we train the Paragraph Vectors algorithm on an augmented dataset by varying the different hyperparameters, cluster the documents from the different vector representations and apply a consensus algorithm on the different partitions. A constrained co-clustering of the co-occurrence matrix between terms and documents is then applied to maintain the consensus partitioning. This method results in significantly better document partitioning quality on various document corpora and provides the advantage of the interpretability offered by co-clustering. Second, we present a method for evaluating co-clustering algorithms by exploiting vector representations of words called word embeddings. Word embeddings are vectors constructed from large volumes of text, one major characteristic of which is that two semantically close words have embeddings that are close in cosine distance. Our method makes it possible to measure the match between the partition of the documents and the partition of the words, thus offering, in a totally unsupervised setting, a measure of co-clustering quality. Third, we are interested in recommending classified ads. We present a system that recommends similar classified ads when one is being viewed. The descriptions of classified ads are often short and syntactically incorrect, and the use of synonyms makes it difficult for traditional systems to accurately measure semantic similarity. In addition, the high renewal rate of classified ads that are still valid (product not sold) calls for choices that keep computation time low. Our method, simple to implement, responds to this use case and is again based on word embeddings.
The use of word embeddings has advantages but also involves some difficulties: the creation of such vectors requires choosing the values of some parameters, and the difference between the corpus on which the word embeddings were built upstream and the one on which they are used raises the problem of out-of-vocabulary words, which have no vector representation. To overcome these problems, we present an analysis of the impact of the different parameters on word embeddings, as well as a study of methods for dealing with the problem of out-of-vocabulary words.
Michel, Fabrice. "Multi-Modal Similarity Learning for 3D Deformable Registration of Medical Images". Phd thesis, Ecole Centrale Paris, 2013. http://tel.archives-ouvertes.fr/tel-01005141.
Texto completo da fonteCerda, Reyes Patricio. "Apprentissage statistique à partir de variables catégorielles non-uniformisées Similarity encoding for learning with dirty categorical variables Encoding high-cardinality string categorical variables". Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS470.
Tabular data often contain columns with categorical variables, usually considered as non-numerical entries with a fixed and limited number of unique elements or categories. As many statistical learning algorithms require numerical representations of features, an encoding step is necessary to transform categorical entries into feature vectors, using for instance one-hot encoding. This and other similar strategies work well, in terms of prediction performance and interpretability, in standard statistical analysis when the number of categories is small. However, non-curated data give rise to string categorical variables with very high cardinality and redundancy: the string entries share semantic and/or morphological information, and several entries can reflect the same entity. Without a data cleaning or feature engineering step, common encoding methods break down, as they tend to lose information in their vector representation. Also, they can create high-dimensional feature vectors, which prevents their use in large-scale settings. In this work, we study a series of categorical encodings that remove the need for preprocessing steps on high-cardinality string categorical variables. An ideal encoder should be scalable to many categories, interpretable to end users, and able to capture the morphological information contained in the string entries. Experiments on real and simulated data show that the methods we propose improve supervised learning, are adapted to large-scale settings, and, in some cases, create feature vectors that are easily interpretable. Hence, they can be applied to the original string entries in Automated Machine Learning (AutoML) pipelines without any human intervention.
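A toy version of the similarity-encoding idea named in this record's title can be sketched in pure Python: each string entry is represented by its character n-gram Jaccard similarity to a few prototype categories, so typos and variants still land near the right category. The prototypes and entries below are invented, and the thesis's actual similarity measures differ.

```python
import numpy as np

def ngrams(s, n=3):
    """Set of character n-grams of a lowercased, padded string."""
    s = " " + s.lower() + " "
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def similarity_encode(entries, prototypes):
    """Encode each string entry as its n-gram Jaccard similarity to each
    prototype category, instead of a one-hot indicator."""
    protos = [ngrams(p) for p in prototypes]
    rows = []
    for e in entries:
        g = ngrams(e)
        rows.append([len(g & p) / len(g | p) for p in protos])
    return np.array(rows)

prototypes = ["police officer", "police sergeant", "fire fighter"]
X = similarity_encode(["Police Officer II", "firefighter", "police oficer"], prototypes)
print(np.round(X, 2))  # the misspelled entry still scores high on "police officer"
```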
Hoffmann, Brice. "Développement d'approches de chémogénomique pour la prédiction des interactions protéine - ligand". Phd thesis, École Nationale Supérieure des Mines de Paris, 2011. http://pastel.archives-ouvertes.fr/pastel-00679718.
Texto completo da fonteCuan, Bonan. "Deep similarity metric learning for multiple object tracking". Thesis, Lyon, 2019. http://www.theses.fr/2019LYSEI065.
Multiple object tracking, i.e. simultaneously tracking multiple objects in a scene, is an important but challenging visual task. Objects should be accurately detected and distinguished from each other to avoid erroneous trajectories. Since remarkable progress has been made in the object detection field, "tracking-by-detection" approaches are widely adopted in multiple object tracking research. Objects are detected in advance, and tracking reduces to an association problem: linking detections of the same object across frames into trajectories. Most tracking algorithms employ both motion and appearance models for data association. For multiple object tracking problems where many objects of the same category exist, a fine-grained discriminative appearance model is paramount and indispensable. Therefore, we propose an appearance-based re-identification model using deep similarity metric learning to deal with multiple object tracking in mono-camera videos. Two main contributions are reported in this dissertation. First, a deep Siamese network is employed to learn an end-to-end mapping from input images to a discriminative embedding space. Different metric learning configurations using various metrics, loss functions, deep network structures, etc., are investigated in order to determine the best re-identification model for tracking. In addition, with an intuitive and simple classification design, the proposed model achieves satisfactory re-identification results, which are comparable to state-of-the-art approaches using triplet losses. Our approach is easy and fast to train, and the learned embedding can be readily transferred onto the domain of tracking tasks. Second, we integrate our proposed re-identification model into multiple object tracking as appearance guidance for detection association. For each object to be tracked in a video, we establish an identity-related appearance model based on the learned embedding for re-identification. Similarities among detected object instances are exploited for identity classification. The collaboration and interference between appearance and motion models are also investigated. An online appearance-motion model coupling is proposed to further improve the tracking performance. Experiments on the Multiple Object Tracking Challenge benchmark prove the effectiveness of our modifications, with state-of-the-art tracking accuracy.
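The association step described above can be illustrated with a small sketch: detections are matched to existing tracks by maximizing the total cosine similarity between their embeddings with the Hungarian algorithm (scipy's linear_sum_assignment). The embeddings here are random stand-ins for a Siamese network's output, and the threshold is an illustrative assumption.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_embs, det_embs, min_sim=0.5):
    """Match detections to tracks by maximum total cosine similarity,
    discarding matches below a similarity threshold."""
    t = track_embs / np.linalg.norm(track_embs, axis=1, keepdims=True)
    d = det_embs / np.linalg.norm(det_embs, axis=1, keepdims=True)
    sim = t @ d.T
    rows, cols = linear_sum_assignment(-sim)  # negate to maximize similarity
    return [(r, c) for r, c in zip(rows, cols) if sim[r, c] >= min_sim]

rng = np.random.default_rng(3)
tracks = rng.normal(size=(3, 128))                            # embeddings of tracked objects
dets = tracks[[2, 0, 1]] + 0.05 * rng.normal(size=(3, 128))   # shuffled, noisy detections
print(associate(tracks, dets))  # recovers the track-detection permutation
```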
André, Barbara. "Atlas intelligent pour guider le diagnostic en endomicroscopie : une application clinique de la reconnaissance d'images par le contenu". Phd thesis, École Nationale Supérieure des Mines de Paris, 2011. http://pastel.archives-ouvertes.fr/pastel-00640899.
Texto completo da fonte
Ngo, Duy Hoa. "Amélioration de l'alignement d'ontologies par les techniques d'apprentissage automatique, d'appariement de graphes et de recherche d'information". Phd thesis, Université Montpellier II - Sciences et Techniques du Languedoc, 2012. http://tel.archives-ouvertes.fr/tel-00767318.
Texto completo da fonte
Le, Boudic-Jamin Mathilde. "Similarités et divergences, globales et locales, entre structures protéiques". Thesis, Rennes 1, 2015. http://www.theses.fr/2015REN1S119/document.
Texto completo da fonte
This thesis focuses on local and global similarities and divergences between protein structures. First, structures are scored with similarity and distance criteria in order to provide a supervised classification of structural domains within existing hierarchical databases, using dominances and learning; these methods make it possible to assign new domains accurately and exactly. Second, we focus on local similarities and propose a method that models protein comparison as a graph problem, where graph traversal finds similar protein substructures. The method is based on compatibility between elements and on distance criteria, and can detect events such as circular permutations, hinges, and structural motif repeats. Finally, we propose a new approach to fine-grained protein structure analysis that focuses on the divergences between similar structures.
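The compatibility-graph idea can be sketched as follows; this is an illustrative toy, not the thesis's algorithm. Nodes pair elements of two structures, edges require consistent internal distances, and similar substructures are read off as connected components (the `tol` threshold and matrix inputs are assumptions).

```python
from itertools import combinations

def compatibility_graph(dist_a, dist_b, tol=1.0):
    """dist_a, dist_b: square matrices of internal distances
    (e.g. between residues) for two protein structures."""
    n, m = len(dist_a), len(dist_b)
    nodes = [(i, j) for i in range(n) for j in range(m)]
    edges = {v: set() for v in nodes}
    # Two pairings are compatible when they map distinct elements and
    # preserve internal distances within the tolerance.
    for (i, j), (k, l) in combinations(nodes, 2):
        if i != k and j != l and abs(dist_a[i][k] - dist_b[j][l]) <= tol:
            edges[(i, j)].add((k, l))
            edges[(k, l)].add((i, j))
    return edges

def connected_components(edges):
    """Traverse the graph; each component is a candidate matched substructure."""
    seen, comps = set(), []
    for v in edges:
        if v in seen:
            continue
        stack, comp = [v], set()
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(edges[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps
```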
Combier, Camille. "Mesures de similarité pour cartes généralisées". Phd thesis, Université Claude Bernard - Lyon I, 2012. http://tel.archives-ouvertes.fr/tel-00995382.
Texto completo da fonte
Miry, Erwan. "Similarité statistique pour le CBR textuel". Thesis, Université Laval, 2007. http://www.theses.ulaval.ca/2007/24972/24972.pdf.
Texto completo da fonte
E-mails have recently become a popular means of communication between companies and their customers. However, the increasing volume of messages makes manual processing difficult, and automatic methods are foreseen as a more efficient solution. Automatic management systems help users process messages and create responses from the messages kept in company databases. One important question in this type of application is how to select existing e-mails to respond to a new request, since creating a new response message requires texts pertaining to the topics of the new request. Finding similarity between documents is therefore an important task. Our goal in this research was to study how to detect similarity between small documents. To accomplish it, we followed a two-pronged approach: finding similarity between words in order to augment a document's vocabulary, and estimating similarity between documents using all the similar words resulting from the previous step. We worked to determine the most effective techniques for detecting textual similarity between documents and to improve those techniques using co-occurrence detection and lexical semantic similarity. During our experiments, we tried different combinations of co-occurrence detection and lexical similarity, and proposed techniques for augmenting the vocabulary of each message, based on different kinds of reasoning, to improve the estimation of similarity between documents. Our results indicate that the proposed augmentation techniques significantly improve the estimation of document similarity. The best results were obtained with a combination of a co-occurrence filter and the cosine metric; however, our experiments clearly indicate that these results do not surpass the performance of similarity techniques based on tf*idf weights.
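For reference, the tf*idf-plus-cosine baseline the abstract compares against can be sketched in a few lines, assuming scikit-learn; the toy messages are illustrative only.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

messages = [
    "My invoice shows a double charge for March.",
    "I was charged twice on last month's invoice.",
    "How do I reset my account password?",
]
# Each message becomes a tf*idf-weighted term vector; cosine similarity
# between vectors estimates document similarity.
tfidf = TfidfVectorizer().fit_transform(messages)
print(cosine_similarity(tfidf).round(2))  # pairwise similarity matrix
```

Because such vectors only match exact shared terms, the first two messages score lower than a human would judge, which is the gap the thesis's vocabulary-augmentation techniques aim to close.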
Haidar, Siba. "Comparaison des documents audiovisuels par Matrice de Similarité". Phd thesis, Université Paul Sabatier - Toulouse III, 2005. http://tel.archives-ouvertes.fr/tel-00011510.
Texto completo da fonte
Classical comparison approaches rely essentially on the low-level features of the documents to be compared, treated as multidimensional vectors. Other approaches rely on the similarity of the images composing the video, without taking into account either the temporal composition of the document or the soundtrack. The shortcoming of these methods is that they restrict comparison to a simple binary operator that is robust to noise; such operators are generally used to identify different copies of the same document. The originality of our approach lies in introducing the notion of style similarity, inspired by the criteria humans apply when comparing video documents. These criteria are more flexible and do not impose strict similarity of all the studied features at once.
Drawing on dynamic programming and time-series comparison, we define an algorithm that extracts similarities between the series of values produced by analyzing low-level audiovisual features. A second, generic processing step then approximates the result of the Longest Common Subsequence (LCS) length algorithm faster than the latter. We propose to represent the data resulting from these treatments as a matrix layout suited to the visual, immediate comparison of two contents. This matrix can also be used to define a generic similarity measure, applicable to documents of the same genre or of heterogeneous genres.
Several applications were implemented to demonstrate the behavior of the comparison method and the similarity measure, as well as their relevance. The experiments mainly concern: identifying an organizational structure in collections and sub-collections of a document base; highlighting stylistic elements in a feature film; and highlighting the program schedule of a television stream.
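The LCS length mentioned above is the quantity the thesis approximates for speed. Here is the standard O(n·m) dynamic program as a reference baseline, not the thesis's fast variant; the quantized feature series are made up for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1  # extend a common subsequence
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]

# On quantized low-level feature series, a normalized LCS length acts
# as a similarity score between two documents.
x = [1, 3, 2, 2, 5, 4]
y = [1, 2, 5, 2, 4, 4]
print(lcs_length(x, y) / max(len(x), len(y)))  # 4/6 ~ 0.67
```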
Chilowicz, Michel. "Recherche de similarité dans du code source". Phd thesis, Université Paris-Est, 2010. http://tel.archives-ouvertes.fr/tel-00587628.
Texto completo da fonte
Omhover, Jean-François. "Recherche d'images par similarité de contenus régionaux". Paris 6, 2004. http://www.theses.fr/2004PA066254.
Texto completo da fonte
Dumont, Émilie. "Similarité des séquences vidéo : application aux rushes". Nice, 2009. http://www.theses.fr/2009NICE4021.
Texto completo da fonte
This thesis deals with video analysis, and in particular with the analysis of video rushes. In filmmaking, "rushes" is the term for the raw, unedited footage shot during the making of a motion picture. We propose several tools for exploring rushes. The first removes redundancy, which can be absolute (the content is not needed) or relative (the content is repetitive). Another is a shot-based video search using a visual dictionary, following the paradigm of textual document search. To create video summaries, we propose a method for representing the quantity of relevant visual content in a video sequence, and a second technique that aligns repetitive video sequences in order to parse the video and remove repeated takes. We also present a collaborative architecture that fuses the analyses of different partners in order to exploit their complementary competences. These systems were evaluated in TRECVID, and the results encouraged us to continue in this direction. The main problem is that TRECVID evaluations are currently performed by human judges, which creates fundamental difficulties: evaluation experiments are expensive to reproduce and are subject to the variability of human judgment. We therefore propose an approach to automate this evaluation procedure using the same quality criteria. Through experiments, we show a good correlation with the manual evaluation.
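The visual-dictionary idea borrowed from text retrieval can be sketched as follows, assuming scikit-learn and NumPy; the random descriptors stand in for real local features (e.g. SIFT), and the codebook size is arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
descriptors = rng.normal(size=(1000, 64))        # pooled local descriptors
codebook = KMeans(n_clusters=50, n_init=10).fit(descriptors)

def bag_of_visual_words(shot_descriptors):
    """Quantize a shot's descriptors against the codebook and return a
    normalized histogram of 'visual words'."""
    words = codebook.predict(shot_descriptors)
    hist = np.bincount(words, minlength=50).astype(float)
    return hist / hist.sum()

# Shots are then compared through their histograms, e.g. with cosine.
shot_a = bag_of_visual_words(rng.normal(size=(120, 64)))
shot_b = bag_of_visual_words(rng.normal(size=(90, 64)))
cosine = shot_a @ shot_b / (np.linalg.norm(shot_a) * np.linalg.norm(shot_b))
print(round(float(cosine), 3))
```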
Hoffmann, Patrick. "Similarité sémantique inter ontologies basée sur le contexte". Phd thesis, Université Claude Bernard - Lyon I, 2008. http://tel.archives-ouvertes.fr/tel-00363300.
Texto completo da fonte
We propose a methodology for determining, modeling, and using context. In applying it, we identify three uses of context that help improve ontology reconciliation: disambiguating the possible pragmatic senses of concepts by comparing the "perspectives" with which the concepts were developed; personalizing by considering the agents' context, made up of a relevant selection of the organization's domains and tasks; and evaluating the relevance of the data associated with a concept for the task that gave rise to the interoperability need.
Haidar, Siba. "Comparaison des documents audiovisuels par matrice de similarité". Toulouse 3, 2005. http://www.theses.fr/2005TOU30078.
Texto completo da fonte
This thesis deals with the comparison of video documents. The field of digital video is expanding rapidly, and videos are now present in large quantities, even for personal use. Video comparison is a basic analysis operation that complements the classification, extraction, and structuring of videos. Traditional comparison approaches are based primarily on the low-level features of the videos to be compared, considered as multidimensional vectors. Other approaches are based on the similarity of frames, without taking into account either the temporal composition of the video or the audio layer. The main disadvantage of these methods is that they reduce comparison to a simple operator robust to noise effects. The originality of our approach lies in the introduction of the notion of style similarity, which draws on human criteria for comparison. These criteria are more flexible and do not impose strict similarity of all the studied features at the same time. We define an algorithm for extracting similarities between low-level audiovisual features, inspired by dynamic programming and time-series comparison methods. We propose a representation of the data resulting from this processing in the form of a matrix pattern suitable for the visual, immediate comparison of two videos. This matrix is then used to propose a generic similarity measure. We developed several applications to demonstrate the behavior of the comparison method and the similarity measure.
Petitcunot, Pierre. "Problèmes de similarité et spectre étendu d'un opérateur". Thesis, Lille 1, 2008. http://www.theses.fr/2008LIL10046/document.
Texto completo da fonte
In this thesis, we study some similarity problems and the extended spectrum of an operator. In the first part, we give criteria for similarity to some classes of partial isometries. For example, we obtain the following result: let T be an operator on a Hilbert space H; then T is similar to the direct sum of a Jordan operator and an isometry if and only if T is power-bounded, T has finite ascent, and there exists a power-bounded operator S ∈ B(H) such that T^n S^n T^n = T^n for all n ∈ ℕ. These results can be seen as partial answers to an open problem of Badea and Mbekhta (2005). In the second part, we obtain a criterion for joint similarity to two contractions, which we apply to obtain results on the perturbation of operators jointly similar to contractions. The extended spectrum is the subject of the last part. Some of its links with other spectra of an operator are presented before studying the behaviour of the extended spectrum for some classes of operators. Finally, we use the extended spectrum to give criteria for hypercyclicity, which we compare to a criterion of Godefroy and Shapiro.
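For readability, the similarity criterion quoted in this abstract can be typeset as follows; this is our reconstruction from the garbled plain text, not the thesis's exact wording.

```latex
% B(H) denotes the algebra of bounded operators on the Hilbert space H;
% asc(T) is the ascent of T.
Let $T \in B(H)$. Then $T$ is similar to the direct sum of a Jordan
operator and an isometry if and only if
\[
  T \text{ is power-bounded}, \qquad \operatorname{asc}(T) < \infty,
\]
and there exists a power-bounded $S \in B(H)$ such that
\[
  T^{n} S^{n} T^{n} = T^{n} \quad \text{for all } n \in \mathbb{N}.
\]
```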
D'Arcy, Jean-François. "Effets mnésiques sur la similarité et l'apprentissage de catégories". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp02/NQ32609.pdf.
Texto completo da fonte
Kupin, Stanislav. "Similarité à un opérateur normal et certains problèmes d'interpolation". Bordeaux 1, 2000. http://www.theses.fr/2000BOR10524.
Texto completo da fonteShortridge-Baillot, Joan. "Similarité et distincitivité en mémoire à court terme verbale". Grenoble 2, 1999. http://www.theses.fr/1999GRE29036.
Texto completo da fonte