Dissertations / Theses on the topic 'Découverte basée sur les données'
Consult the top 50 dissertations / theses for your research on the topic 'Découverte basée sur les données.'
Pierret, Jean-Dominique. "Méthodologie et structuration d'un outil de découverte de connaissances basé sur la littérature biomédicale : une application basée sur l'exploitation du MeSH." Toulon, 2006. http://tel.archives-ouvertes.fr/tel-00011704.
The information available in bibliographic databases is dated, validated by a long process, and therefore rarely innovative. Bibliographic databases are usually queried in a Boolean way, and the result of a query is a set of already-known facts that bring no additional novelty. In 1985, Don Swanson proposed an original method to draw innovative information out of bibliographic databases. His reasoning is based on the systematic use of the biomedical literature to uncover latent connections between different well-established pieces of knowledge. He demonstrated the unsuspected potential of bibliographic databases for knowledge discovery. The value of his work lay not in the nature of the available information but in the methodology he used. This general methodology applies best to validated and structured information, which is precisely what bibliographic information is. We propose to test the robustness of Swanson's theory by setting out the methods it inspired, which all lead to the same conclusions as Swanson's own. We then explain how we developed a knowledge discovery system based on the literature available from public biomedical information sources.
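Swanson's reasoning is commonly summarized as the ABC model: if the literature connects A to B and B to C, but A and C never co-occur, then A-C is a latent, potentially innovative connection. A minimal sketch of that open-discovery step, assuming a toy term-document index (the articles and terms are hypothetical, chosen to echo Swanson's fish-oil/Raynaud example):

```python
from collections import defaultdict

# Toy corpus: each "article" is the set of MeSH-like terms it mentions.
articles = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"blood viscosity", "Raynaud's disease"},
    {"platelet aggregation", "Raynaud's disease"},
    {"fish oil", "epanolol"},  # unrelated noise
]

# Index: term -> set of article ids in which it appears.
index = defaultdict(set)
for doc_id, terms in enumerate(articles):
    for term in terms:
        index[term].add(doc_id)

def swanson_abc(a_term, min_bridges=2):
    """Rank candidate C terms never co-mentioned with A but linked to it
    through intermediate B terms (open discovery, A -> B -> C)."""
    a_docs = index[a_term]
    b_terms = {t for d in a_docs for t in articles[d]} - {a_term}
    candidates = defaultdict(set)
    for b in b_terms:
        for d in index[b] - a_docs:          # docs mentioning B but not A
            for c in articles[d] - {b, a_term}:
                if not (index[c] & a_docs):  # C never co-occurs with A
                    candidates[c].add(b)
    ranked = {c: bs for c, bs in candidates.items() if len(bs) >= min_bridges}
    return sorted(ranked.items(), key=lambda kv: -len(kv[1]))

print(swanson_abc("fish oil"))
# -> [("Raynaud's disease", {'blood viscosity', 'platelet aggregation'})]
```

Swanson's actual procedure worked over MeSH-indexed MEDLINE records with careful manual filtering; ranking by the number of distinct B bridges, as above, is only the simplest plausible scoring.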
Pierret, Jean-Dominique. "Methodologie et structuration d'un outil de decouverte de connaissances base sur la litterture biomedicale : une application basee sur le MeSH." Phd thesis, Université du Sud Toulon Var, 2006. http://tel.archives-ouvertes.fr/tel-00011704.
Full textPourtant, en 1985, Don Swanson propose une méthode originale pour extraire de bases de donnés une information innovante. Son raisonnement est basé sur une exploitation systématique de la littérature biomédicale afin de dégager des connexions latentes entre différentes connaissances bien établies. Ses travaux montrent le potentiel insoupçonné des bases bibliographiques dans la révélation et la découverte de connaissances. Cet intérêt ne tient pas tant à la nature de l'information disponible qu'à la méthodologie utilisée. Cette méthodologie générale s'applique de façon privilégiée dans un environnement d'information validée et structurée ce qui est le cas de l'information bibliographique. Nous proposons de tester la robustesse de la théorie de Swanson en présentant les méthodes qu'elle a inspirées et qui conduisent toutes aux mêmes conclusions. Nous exposons ensuite, comment à partir de sources d'information biomédicales publiques, nous avons développé un système de découverte de connaissances basé sur la littérature.
Jiao, Yunlong. "Pronostic moléculaire basé sur l'ordre des gènes et découverte de biomarqueurs guidé par des réseaux pour le cancer du sein." Thesis, Paris Sciences et Lettres (ComUE), 2017. http://www.theses.fr/2017PSLEM027/document.
Breast cancer is the second most common cancer worldwide and the leading cause of women's death from cancer. Improving cancer prognosis has been one of the problems of primary interest in working towards better clinical management and treatment decision making for cancer patients. With the rapid advancement of genomic profiling technologies in the past decades, the easy availability of a substantial amount of genomic data for medical research has been motivating the currently popular trend of using computational tools, especially machine learning in the era of data science, to discover molecular biomarkers for prognosis improvement. This thesis follows two lines of approach intended to address, from the methodological standpoint of machine learning, two major challenges arising in genomic data analysis for breast cancer prognosis: rank-based approaches for improved molecular prognosis and network-guided approaches for enhanced biomarker discovery. Furthermore, the methodologies developed and investigated in this thesis, pertaining respectively to learning with rank data and learning on graphs, make a significant contribution to several branches of machine learning, with applications across, but not limited to, cancer biology and social choice theory.
Ermolaev, Andrei. "Data-driven methods for analysing nonlinear propagation in optical fibres." Electronic Thesis or Diss., Bourgogne Franche-Comté, 2024. http://www.theses.fr/2024UBFCD020.
This thesis aims to apply machine learning methods specifically tailored to the analysis and interpretation of optical pulses as they propagate in a single pass through an optical fibre, under a variety of conditions. In particular, we focus on data-driven model discovery: using machine learning to analyse data from a physical system with the aim of discovering interpretable and generalizable models, and developing methods that can substantially accomplish and/or complement conventional theoretical analysis. To this end, both supervised and unsupervised learning methods are used to deepen the understanding of ultrafast nonlinear phenomena in fibre-optic systems.
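A representative data-driven model discovery technique (the abstract does not name a specific one; sparse regression in the style of SINDy is a common choice) recovers governing equations by regressing numerical derivatives onto a library of candidate terms. A minimal sketch on synthetic damped-oscillator data:

```python
import numpy as np

# Synthetic data from a damped oscillator: dx/dt = -0.1x + 2y, dy/dt = -2x - 0.1y.
dt, T = 0.01, 10.0
t = np.arange(0, T, dt)
X = np.zeros((len(t), 2))
X[0] = [2.0, 0.0]
A = np.array([[-0.1, 2.0], [-2.0, -0.1]])
for k in range(len(t) - 1):                      # simple Euler integration
    X[k + 1] = X[k] + dt * X[k] @ A.T

dXdt = np.gradient(X, dt, axis=0)                # numerical derivatives

# Candidate library of terms: [1, x, y, x^2, xy, y^2].
x, y = X[:, 0], X[:, 1]
Theta = np.column_stack([np.ones_like(x), x, y, x**2, x*y, y**2])
names = ["1", "x", "y", "x^2", "xy", "y^2"]

def stlsq(Theta, dXdt, threshold=0.05, iters=10):
    """Sequentially thresholded least squares (the core of SINDy)."""
    Xi, _, _, _ = np.linalg.lstsq(Theta, dXdt, rcond=None)
    for _ in range(iters):
        Xi[np.abs(Xi) < threshold] = 0.0
        for j in range(dXdt.shape[1]):           # refit on active terms only
            big = np.abs(Xi[:, j]) >= threshold
            if big.any():
                Xi[big, j] = np.linalg.lstsq(Theta[:, big], dXdt[:, j],
                                             rcond=None)[0]
    return Xi

Xi = stlsq(Theta, dXdt)
for j, eq in enumerate(["dx/dt", "dy/dt"]):
    terms = [f"{c:+.2f}*{n}" for c, n in zip(Xi[:, j], names) if c != 0]
    print(eq, "=", " ".join(terms))              # recovers the linear dynamics
```

In fibre optics the state variables and library would instead be built from the measured pulse envelope, but the regression skeleton is the same.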
Marie, Nicolas. "Recherche exploratoire basée sur des données liées." Thesis, Nice, 2014. http://www.theses.fr/2014NICE4129/document.
The general topic of this thesis is web search, and in particular how to leverage data semantics for exploratory search. Exploratory search refers to cognitively demanding, open-ended, multi-faceted and iterative search tasks, such as learning about or investigating a topic. Semantic data, and linked data in particular, offer new possibilities for solving complex search queries and information needs, including exploratory ones. In this context the linked open data cloud plays an important role by allowing advanced data processing and the elaboration of innovative interaction models. First, we give a state-of-the-art review of linked-data-based exploratory search approaches and systems. Then we propose a linked-data-based exploratory search solution built mainly on an associative retrieval algorithm: starting from a spreading activation algorithm, we propose new diffusion formulas optimized for typed graphs. From this formalization we derive additional formalizations of several advanced querying modes in order to address complex exploratory search needs. We also propose an innovative software architecture based on two paradigmatic design choices: results are computed at query time, and data are consumed remotely from distant SPARQL endpoints. This gives a high level of flexibility in terms of querying and data selection. We specified, designed and evaluated the Discovery Hub web application, which retrieves results and presents them in an interface optimized for exploration. We evaluate our approach through several human evaluations, and we open the discussion about new ways to evaluate exploratory search engines.
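The core associative-retrieval idea can be sketched as plain spreading activation over a typed graph. This is a simplified stand-in for the thesis's type-optimized diffusion formulas; the triples and predicate weights below are hypothetical:

```python
from collections import defaultdict

# Toy typed linked-data graph: (subject, predicate, object) triples.
triples = [
    ("Dali", "movement", "Surrealism"),
    ("Magritte", "movement", "Surrealism"),
    ("Dali", "birthPlace", "Figueres"),
    ("Magritte", "birthPlace", "Lessines"),
    ("Breton", "movement", "Surrealism"),
]

neighbours = defaultdict(list)
for s, p, o in triples:
    neighbours[s].append((p, o))
    neighbours[o].append((p, s))   # propagate in both directions

# Predicate weights stand in for the type-aware tuning of the diffusion formula.
weights = {"movement": 1.0, "birthPlace": 0.3}

def spread(seed, steps=2, decay=0.5):
    """Plain spreading activation: each node forwards a decayed share of
    its activation to its neighbours, modulated by the edge type."""
    activation = {seed: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, act in activation.items():
            for pred, nb in neighbours[node]:
                nxt[nb] += act * decay * weights.get(pred, 0.1)
        for node, act in nxt.items():
            activation[node] = activation.get(node, 0.0) + act
    activation.pop(seed, None)
    return sorted(activation.items(), key=lambda kv: -kv[1])

print(spread("Dali"))  # Surrealism-related entities rank first
```

In the actual system the graph is not local but queried at run time from remote SPARQL endpoints, which is what the architecture's two design choices enable.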
Vigneron, Vincent. "Programmation par contraintes et découverte de motifs sur données séquentielles." Thesis, Angers, 2017. http://www.theses.fr/2017ANGE0028/document.
Recent works have shown the relevance of constraint programming to tackle data mining tasks. This thesis follows this approach and addresses motif discovery in sequential data. We focus in particular, in the case of classified sequences, on the search for motifs that best fit each individual class. We propose a language of constraints over matrix domains to model such problems. The language assumes a preprocessing of the data set (e.g., by pre-computing the locations of each character in each sequence) and views a motif as the choice of a sub-matrix (i.e., characters, sequences, and locations). We introduce different matrix constraints (compatibility of locations with the database, class covering, location-based character ordering common to sequences, etc.) and address two NP-complete problems: the search for class-specific totally ordered motifs (e.g., exclusive subsequences) or partially ordered motifs. We provide two CSP models that rely on global constraints to prove exclusivity. We then present a memetic algorithm that uses this CSP model during initialisation and intensification. This hybrid approach proves competitive compared to the pure CSP approach, as shown by experiments carried out on protein sequences. Lastly, we investigate data set preprocessing based on patterns rather than characters, in order to reduce the size of the resulting matrix domain. To this end, we present and compare two alternative methods, one based on lattice search, the other on dynamic programming.
Tanasescu, Adrian. "Vers un accès sémantique aux données : approche basée sur RDF." Lyon 1, 2007. http://www.theses.fr/2007LYO10069.
The thesis mainly focuses on information retrieval through the querying of RDF documents. We propose an approach able to provide complete and pertinent answers to a user-formulated SPARQL query. The approach mainly consists of (1) determining, through a similarity measure that uses the associated ontological knowledge, whether two RDF graphs are contradictory, and (2) building pertinent answers by combining statements belonging to non-contradicting RDF graphs that each partially answer a given query. We also present an RDF storage and querying platform, named SyRQuS, whose query answering plan is entirely based on the proposed querying approach. SyRQuS is a Web-based platform that mainly provides users with a querying interface where queries can be formulated in SPARQL.
Olmos Marchant, Luis Felipe. "Modélisation de performance des caches basée sur l'analyse de données." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLX008/document.
The need to distribute massive quantities of multimedia content to multiple users has increased tremendously in the last decade. The current solution to this ever-growing demand is Content Delivery Networks, an application-layer architecture that nowadays handles the majority of multimedia traffic. This distribution problem has also motivated the study of new solutions, such as the Information Centric Networking paradigm, whose aim is to add content delivery capabilities to the network layer by decoupling data from its location. In both architectures, cache servers play a key role, allowing efficient use of network resources for content delivery. As a consequence, the study of cache performance evaluation techniques has found new momentum in recent years. In this dissertation, we propose a framework for the performance modeling of a cache ruled by the Least Recently Used (LRU) discipline. Our framework is data-driven since, in addition to the usual mathematical analysis, we address two additional data-related problems: the first is to propose a model that is a priori both simple and representative of the essential features of the measured traffic; the second is the estimation of the model parameters from traffic traces. The contributions of this thesis concern each of these tasks. For our first contribution, we propose a parsimonious traffic model featuring a document catalog that evolves in time, obtained by allowing each document to be available for a limited (random) period. To make a sensible proposal, we apply the "semi-experimental" method to real data. These semi-experiments consist of two phases: first, we randomize the traffic trace to break specific dependence structures in the request sequence; second, we simulate an LRU cache with the randomized request sequence as input. For a candidate model, we refute an independence hypothesis if the resulting hit probability curve differs significantly from the one obtained from the original trace. With the insights obtained, we propose a traffic model based on so-called Poisson cluster point processes. Our second contribution is a theoretical estimation of the cache hit probability for a generalization of the latter model. For this objective, we use the Palm distribution of the model to set up a probability space in which a document can be singled out for analysis. In this setting, we obtain an integral formula for the average number of misses. Finally, by means of a scaling of system parameters, we obtain for the latter expression an asymptotic expansion for large cache sizes. This expansion quantifies the error of a heuristic widely used in the literature, known as the "Che approximation", thus justifying and extending it. Our last contribution concerns the estimation of the model parameters. We tackle this problem for the simpler and widely used Independent Reference Model. By considering its parameter (a popularity distribution) to be a random sample, we implement a Maximum Likelihood method to estimate it. This method allows us to seamlessly handle the censoring phenomena occurring in traces. By measuring the cache performance obtained with the resulting model, we show that it provides a more representative model of the data than typical ad-hoc methodologies.
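Since the abstract pivots on the "Che approximation", a compact numerical sketch may help: under the Independent Reference Model with popularities p_i, an LRU cache of size C is summarized by a characteristic time t_C solving sum_i (1 - exp(-p_i t_C)) = C, and the per-document hit probability is 1 - exp(-p_i t_C). The Zipf popularity and the sizes below are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import brentq

# Independent Reference Model: document i is requested with probability p_i (Zipf).
N, C = 10_000, 500                      # catalog size, cache size (in documents)
ranks = np.arange(1, N + 1)
p = ranks ** -0.8
p /= p.sum()

# Che approximation for LRU: every document stays in cache for a
# "characteristic time" t_C, the root of  sum_i (1 - exp(-p_i * t_C)) = C.
t_C = brentq(lambda t: np.sum(1.0 - np.exp(-p * t)) - C, 1e-9, 1e12)

hit_i = 1.0 - np.exp(-p * t_C)          # per-document hit probability
hit_rate = np.sum(p * hit_i)            # overall hit probability
print(f"t_C = {t_C:.1f}, hit rate = {hit_rate:.3f}")
```

The thesis's asymptotic expansion quantifies exactly how far this heuristic sits from the true LRU miss rate as the cache grows.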
Soulet, Arnaud. "Un cadre générique de découverte de motifs sous contraintes fondées sur des primitives." Phd thesis, Université de Caen, 2006. http://tel.archives-ouvertes.fr/tel-00123185.
Pattern discovery is a central task in knowledge discovery in databases. This thesis deals with the extraction of local patterns under constraints. We shed new light on the problem with a framework that combines monotone primitives to define arbitrary constraints. The variety of these constraints expresses precisely the archetype of the patterns sought by the user within a database. We then propose two types of automatic and generic extraction approaches, despite the algorithmic difficulties inherent in this task. Their efficiency relies mainly on the use of necessary conditions to approximate the variations of the constraint. On the one hand, relaxation methods make it possible to reuse the many usual algorithms of the field. On the other hand, we design direct extraction methods dedicated to itemset patterns in large or correlated data sets by exploiting equivalence classes. Finally, the use of our methods has enabled the discovery of local phenomena in industrial and medical applications.
Chamekh, Fatma. "L’évolution du web de données basée sur un système multi-agents." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSE3083/document.
In this thesis, we investigate the evolution of RDF datasets built from documents and from the LOD. We identify the following issues: the integration of new triples, the proposal of changes that takes data quality into account, and the management of different versions. To handle the complexity of web-of-data evolution, we propose an agent-based argumentation framework, on the assumption that agent specifications can facilitate the process of RDF dataset evolution. Agent technology is one of the most useful solutions for coping with a complex problem. The agents work as a team and are autonomous, in the sense that they have the ability to decide for themselves which goals they should adopt and how these goals should be achieved. The agents use argumentation theory to reach a consensus about the best change alternative; to this end, we propose an argumentation model based on metrics related to the intrinsic quality dimensions. To keep a record of all the modifications that occur, we focus on resource versioning. In a collaborative environment, several conflicts can be generated; to manage those conflicts, we define rules. The application domain is general medicine.
Lopez-Enriquez, Carlos-Manuel. "HyQoZ - Optimisation de requêtes hybrides basée sur des contrats SLA." Thesis, Grenoble, 2014. http://www.theses.fr/2014GRENM060/document.
Today we are witnessing an explosion of data produced massively by widely distributed devices (e.g. sensors, personal computers, laptops, networks) and exposed by means of data services. In this context, the problem is to evaluate queries called hybrid, because they involve aspects of classic, mobile and continuous queries over static or nomadic data services operating in push or pull mode. The objective of my thesis is to propose an approach for optimizing hybrid queries based on multi-criteria preferences (i.e. SLA, Service Level Agreement). The principle is to combine data services so as to construct a query evaluator adapted to the preferences expressed in the SLA, while the state of the services and of the network is taken into account as QoS measures.
Ahmad, Houda. "Une approche matérialisée basée sur les vues pour l'intégration de documents XML." Phd thesis, Grenoble 1, 2009. http://www.theses.fr/2009GRE10086.
Semi-structured data play an increasing role in the development of the Web through the use of XML. However, the management of semi-structured data poses specific problems because semi-structured data, contrary to classical databases, do not rely on a predefined schema. The schema of a document is contained in the document itself, and similar documents may be represented by different schemas. Consequently, the techniques and algorithms used for querying or integrating this data are more complex than those used for structured data. The objective of our work is the integration of XML data using the principles of Osiris, a prototype KB-DBMS in which views are a central concept. In this system, a family of objects is defined by a hierarchy of views, where a view is defined by its parent views and its own attributes and constraints. Osiris belongs to the family of Description Logics; the minimal view of a family of objects is assimilated to a primitive concept and its other views to defined concepts. An object of a family satisfies some of its views. For each family of objects, Osiris builds an n-dimensional classification space by analysing the constraints defined in all of its views. This space is used for object classification and indexing. In this thesis we study the contribution of the main features of Osiris (classification, indexing and semantic query optimization) to the integration of XML documents. For this purpose we produce a target schema (an abstract XML schema) which represents an Osiris schema; every document satisfying a source schema (a concrete XML schema) is rewritten in terms of the target schema before undergoing the extraction of the values of its entities. The objects corresponding to these entities are then classified and indexed. The Osiris mechanism for semantic query optimization can then be used to extract the objects relevant to a query.
Ahmad, Houda. "Une approche matérialisée basée sur les vues pour l'intégration de documents XML." Phd thesis, Université Joseph Fourier (Grenoble), 2009. http://tel.archives-ouvertes.fr/tel-00957148.
Full textBen, Ellefi Mohamed. "La recommandation des jeux de données basée sur le profilage pour le liage des données RDF." Thesis, Montpellier, 2016. http://www.theses.fr/2016MONTT276/document.
With the emergence of the Web of Data, most notably Linked Open Data (LOD), an abundance of data has become available on the web. However, LOD datasets and their inherent subgraphs vary heavily with respect to their size, topic and domain coverage, their schemas and their data dynamicity (respectively schemas and metadata) over time. To this extent, identifying suitable datasets which meet specific criteria has become an increasingly important, yet challenging, task in supporting issues such as entity retrieval, semantic search and data linking. Particularly with respect to interlinking, the current topology of the LOD cloud underlines the need for practical and efficient means to recommend suitable datasets: currently, only well-known reference graphs such as DBpedia (the most obvious target), YAGO or Freebase show a high number of in-links, while there exists a long tail of potentially suitable yet under-recognized datasets. This problem is due to the Semantic Web tradition in dealing with "finding candidate datasets to link to", where data publishers are expected to identify target datasets for interlinking. Since an understanding of the nature of the content of specific datasets is a crucial prerequisite for the mentioned issues, we adopt in this dissertation the notion of "dataset profile": a set of features that describe a dataset and allow the comparison of different datasets with regard to their represented characteristics. Our first research direction was to implement a collaborative-filtering-like dataset recommendation approach, which exploits both existing dataset topic profiles and traditional dataset connectivity measures, in order to link LOD datasets into a global dataset-topic graph. This approach relies on the LOD graph in order to learn the connectivity behaviour between LOD datasets. However, experiments have shown that the current topology of the LOD cloud is far from complete enough to be considered as ground truth, and consequently as learning data. Facing the limits of the current LOD topology (as learning data), our research led us to break away from the topic-profile representation of the "learning to rank" approach and to adopt a new approach for candidate dataset identification, where the recommendation is based on the overlap between the intensional profiles of different datasets. By intensional profile, we mean the formal representation of a set of schema concept labels that best describe a dataset, potentially enriched with the corresponding textual descriptions. This representation provides richer contextual and semantic information and allows similarities between profiles to be computed efficiently and inexpensively. We identify schema overlap with the help of a semantico-frequential concept similarity measure and a ranking criterion based on tf*idf cosine similarity. The experiments, conducted over all available linked datasets in the LOD cloud, show that our method achieves an average precision of up to 53% for a recall of 100%. Furthermore, our method returns the mappings between schema concepts across datasets, a particularly useful input for the data linking step. In order to ensure high-quality, representative dataset schema profiles, we introduce Datavore, a tool oriented towards metadata designers that provides ranked lists of vocabulary terms to reuse in the data modeling process, together with additional metadata and cross-term relations. The tool relies on the Linked Open Vocabulary (LOV) ecosystem for acquiring vocabularies and metadata, and is made available to the community.
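The profile-overlap ranking reduces, at its core, to tf*idf cosine similarity between bags of schema concept labels. A minimal sketch, with hypothetical profiles standing in for a target dataset and two candidates:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Intensional profiles: schema concept labels (with short descriptions)
# standing in for the target and candidate LOD datasets.
profiles = {
    "target":     "person birth place author publication book",
    "candidate1": "writer author book publication literary work",
    "candidate2": "protein gene enzyme pathway organism",
}

names = list(profiles)
tfidf = TfidfVectorizer().fit_transform([profiles[n] for n in names])
sims = cosine_similarity(tfidf[0:1], tfidf[1:]).ravel()

for name, score in sorted(zip(names[1:], sims), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")   # candidate1 ranks far above candidate2
```

The thesis additionally applies a semantico-frequential concept similarity before this ranking step; plain lexical tf*idf is the simplified baseline shown here.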
Calmant, Stéphane. "Etude du comportement rhéologique de la lithosphère océanique basée sur les données spaciales." Toulouse 3, 1987. http://www.theses.fr/1987TOU30169.
Full textAlili, Hiba. "Intégration de données basée sur la qualité pour l'enrichissement des sources de données locales dans le Service Lake." Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLED019.
In the Big Data era, companies are moving away from traditional data-warehouse solutions, whereby expensive and time-consuming ETL (Extract, Transform, Load) processes are used, towards data lakes in order to manage their increasingly growing data. Yet the knowledge stored in companies' databases, even in the constructed data lakes, can never be complete and up-to-date, because of the continuous production of data. Local data sources often need to be augmented and enriched with information coming from external data sources. Unfortunately, the data enrichment process is one of the manual labors undertaken by experts, who enrich data by adding information based on their expertise or select relevant data sources to complete missing information. Such work can be tedious, expensive and time-consuming, making it very promising for automation. We present in this work an active user-centric data integration approach to automatically enrich local data sources, in which the missing information is leveraged on the fly from web sources using data services. Accordingly, our approach enables users to query for information about concepts that are not defined in the data source schema. In doing so, we take into consideration a set of user preferences, such as the cost threshold and the response time necessary to compute the desired answers, while ensuring a good quality of the obtained results.
Simon, Franck. "Découverte causale sur des jeux de données classiques et temporels. Application à des modèles biologiques." Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS528.
This thesis focuses on the field of causal discovery: the construction of causal graphs from observational data, and in particular temporal causal discovery and the reconstruction of large gene regulatory networks. After a brief history, the thesis introduces the main concepts, hypotheses and theorems underlying causal graphs, as well as the two main families of approaches: score-based and constraint-based methods. The MIIC (Multivariate Information-based Inductive Causation) method, developed in our laboratory, is then described with its latest improvements: Interpretable MIIC. The issues and solutions involved in constructing a temporal version (tMIIC) are presented, together with benchmarks reflecting the advantages of tMIIC over other state-of-the-art methods. The application to sequences of microscope images of a tumor environment reconstituted on microchips illustrates the ability of tMIIC to recover, solely from data, known and new relationships. Finally, the thesis introduces the use of an a priori restriction to consequence-only variables in order to apply causal discovery to the reconstruction of gene regulatory networks. By assuming that all genes, except transcription factors, are consequence-only genes, it becomes possible to reconstruct graphs with thousands of genes. The ability to identify key transcription factors de novo is illustrated by an application to single-cell RNA sequencing data, with the discovery of two transcription factors likely to be involved in the biological process of interest.
Verney, Philippe. "Interprétation géologique de données sismiques par une méthode supervisée basée sur la vision cognitive." Phd thesis, École Nationale Supérieure des Mines de Paris, 2009. http://pastel.archives-ouvertes.fr/pastel-00005861.
Full textNguyen, Thi Dieu Thu. "Une approche basée sur LD pour l'interrogation de données relationnelles dans le Web sémantique." Nice, 2008. http://www.theses.fr/2008NICE4007.
The Semantic Web is a new Web paradigm that provides a common framework for data to be shared and reused across applications, enterprises and community boundaries. The biggest problem we face right now is finding a way to "link" information coming from sources that are often heterogeneous both syntactically and semantically. Since much information is today stored in relational databases, integrating data from relational sources into the Semantic Web is in high demand. The objective of this thesis is to provide methods and techniques to address this problem. It proposes an approach combining ontology-based schema representation and description logics. Database schemas in the approach are designed using the ORM methodology; the stability and flexibility of ORM facilitate the maintenance and evolution of integration systems. A new web ontology language and its logical foundation are proposed in order to capture the semantics of relational data sources while still ensuring decidable, automated reasoning over information from the sources. An automatic translation of ORM models into ontologies is introduced so that data semantics can be captured without tedious and error-prone manual work. This mechanism provides for the coexistence of other sources, such as hypertext, integrated into the Semantic Web environment. This thesis constitutes an advance in several fields, namely data integration, ontology engineering, description logics and conceptual modeling. It is hoped to provide a foundation for further investigations of data integration from relational sources into the Semantic Web.
Qiu, Han. "Une architecture de protection des données efficace basée sur la fragmentation et le cryptage." Electronic Thesis or Diss., Paris, ENST, 2017. http://www.theses.fr/2017ENST0049.
In this thesis, a completely revisited data protection scheme based on selective encryption is presented. First, this new scheme is agnostic in terms of data format; second, it has a parallel architecture using GPGPU that allows performance at least comparable to full encryption algorithms. Bitmap, as a typical uncompressed multimedia format, is addressed as a first use case. The Discrete Cosine Transform (DCT) is the first transformation used for splitting fragments, protecting the data, and storing the fragments separately on the local device and on cloud servers. This work largely improves on previously published bitmap protection schemes by providing new designs and practical experimentation. A general-purpose graphics processing unit (GPGPU) is exploited as an accelerator to guarantee the efficiency of the computation compared with traditional full encryption algorithms. Then, an agnostic selective encryption scheme based on the lossless Discrete Wavelet Transform (DWT) is presented. This design, with practical experimentation on different hardware configurations, provides a strong level of protection and good performance at the same time, plus flexible storage dispersion schemes. Our agnostic data protection and transmission solution, combining fragmentation, encryption and dispersion, is thereby made available for a wide range of end-user applications. A complete set of security analyses is also carried out to test the level of protection provided.
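The DCT-based fragmentation step can be sketched as follows: transform a block, keep the energy-packing low-frequency coefficients as the small private fragment (the part that would be encrypted and kept locally; the cipher itself is elided here), and disperse the remaining coefficients to remote storage. Block size and fragment size are illustrative assumptions:

```python
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
image = rng.integers(0, 256, (8, 8)).astype(float)   # one 8x8 "bitmap" block

coeffs = dctn(image, norm="ortho")

# Fragmentation: the low-frequency corner carries most of the energy and is
# the fragment to encrypt and keep on the trusted device; the rest can be
# dispersed to cloud storage.
k = 3
low_mask = np.zeros_like(coeffs, dtype=bool)
low_mask[:k, :k] = True
private_fragment = coeffs[low_mask]                 # to be encrypted (AES, etc.)
public_fragment = np.where(low_mask, 0.0, coeffs)   # stored remotely

# Reconstruction from the public fragment alone is heavily degraded ...
leak = idctn(public_fragment, norm="ortho")
# ... while recombining both fragments restores the block exactly.
restored_coeffs = public_fragment.copy()
restored_coeffs[low_mask] = private_fragment
restored = idctn(restored_coeffs, norm="ortho")

print("public-only error:", np.abs(leak - image).mean())
print("full restore error:", np.abs(restored - image).mean())  # ~0
```

The thesis's contribution lies in doing this format-agnostically, at GPGPU speed, and with analyzed security guarantees; the sketch only illustrates why a small, well-chosen fragment suffices for protection.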
Al-Najdi, Atheer. "Une approche basée sur les motifs fermés pour résoudre le problème de clustering par consensus." Thesis, Université Côte d'Azur (ComUE), 2016. http://www.theses.fr/2016AZUR4111/document.
Clustering is the process of partitioning a dataset into groups, so that the instances in the same group are more similar to each other than to instances in any other group. Many clustering algorithms have been proposed, but none of them provides good-quality partitions in all situations. Consensus clustering aims to enhance the clustering process by combining different partitions obtained from different algorithms to yield a better-quality consensus solution. In this work, a new consensus clustering method, called MultiCons, is proposed. It uses the frequent closed itemset mining technique in order to discover the similarities between the different base clustering solutions. The identified similarities are presented in the form of clustering patterns, each of which defines the agreement between a set of base clusters in grouping a set of instances. By dividing these patterns into groups based on the number of base clusters that define the pattern, MultiCons generates a consensus solution from each group, resulting in multiple consensus candidates. These different solutions are presented in a tree-like structure, called ConsTree, that facilitates understanding of the process of building the multiple consensuses, as well as of the relationships between the data instances and their structuring in the data space. Five consensus functions are proposed in this work in order to build a consensus solution from the clustering patterns. Approach 1 simply merges any intersecting clustering patterns. Approach 2 can either merge or split intersecting patterns based on a proposed measure, called the intersection ratio.
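A toy sketch of the pattern-merging step (Approaches 1 and 2 above). The exact definition of the intersection ratio is not given in this excerpt; the measure below (overlap relative to the smaller pattern) is an assumption for illustration:

```python
# Clustering patterns: instance sets on which a group of base clusterings agree.
patterns = [
    {1, 2, 3, 4},
    {3, 4, 5},
    {6, 7, 8},
    {7, 8, 9},
]

def intersection_ratio(a, b):
    """Assumed form of the measure: overlap relative to the smaller pattern."""
    return len(a & b) / min(len(a), len(b))

def build_consensus(patterns, threshold=0.5):
    """Greedy variant of the merge step: fuse two patterns whenever their
    intersection ratio exceeds the threshold, otherwise keep them apart."""
    clusters = [set(p) for p in patterns]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if intersection_ratio(clusters[i], clusters[j]) >= threshold:
                    clusters[i] |= clusters.pop(j)
                    merged = True
                    break
            if merged:
                break
    return clusters

print(build_consensus(patterns))   # [{1, 2, 3, 4, 5}, {6, 7, 8, 9}]
```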
Bouker, Slim. "Contribution à l'extraction des règles d'association basée sur des préférences." Thesis, Clermont-Ferrand 2, 2015. http://www.theses.fr/2015CLF22585/document.
Full textTa, Minh Thuy. "Techniques d'optimisation non convexe basée sur la programmation DC et DCA et méthodes évolutives pour la classification non supervisée." Thesis, Université de Lorraine, 2014. http://www.theses.fr/2014LORR0099/document.
This thesis focuses on four problems in data mining and machine learning: clustering data streams, clustering massive data sets, weighted hard and fuzzy clustering, and clustering without prior knowledge of the number of clusters. Our methods are based on deterministic optimization approaches, namely DC (Difference of Convex functions) programming and DCA (Difference of Convex functions Algorithm), for solving some classes of the clustering problems cited above, as well as on elitist evolutionary approaches. We adapt the clustering algorithm DCA-MSSC to deal with data streams using two window models: sub-windows and sliding windows. For the problem of clustering massive data sets, we propose to use the DCA algorithm in two phases. In the first phase, the massive data set is divided into several subsets, on which the algorithm DCA-MSSC performs clustering. In the second phase, we propose a DCA-Weight algorithm to perform a weighted clustering on the centers obtained in the first phase. For weighted clustering, we also propose two approaches, weighted hard clustering and weighted fuzzy clustering, and we test our approach on an image segmentation application. The final issue addressed in this thesis is clustering without prior knowledge of the number of clusters. We propose an elitist evolutionary approach in which several evolutionary algorithms (EAs) are applied at the same time, to find the optimal combination of initial cluster seeds and, at the same time, the optimal number of clusters. The various tests performed on several large data sets are very promising and demonstrate the effectiveness of the proposed approaches.
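The two-phase scheme for massive data has a simple shape, sketched below with plain k-means standing in for DCA-MSSC and DCA-Weight (an illustrative substitution, not the thesis's actual DC optimization):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# "Massive" data set: three Gaussian blobs, split into chunks as in phase one.
blobs = [rng.normal(c, 0.3, (2000, 2)) for c in ((0, 0), (4, 4), (0, 4))]
data = rng.permutation(np.vstack(blobs))
chunks = np.array_split(data, 10)

# Phase 1: cluster each subset independently (k-means stands in for DCA-MSSC)
# and keep the local centers together with their cluster sizes as weights.
centers, weights = [], []
for chunk in chunks:
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(chunk)
    centers.append(km.cluster_centers_)
    weights.append(np.bincount(km.labels_, minlength=3))

centers = np.vstack(centers)
weights = np.concatenate(weights).astype(float)

# Phase 2: weighted clustering of the local centers
# (standing in for DCA-Weight).
final = KMeans(n_clusters=3, n_init=10, random_state=0)
final.fit(centers, sample_weight=weights)
print(final.cluster_centers_)   # close to the true blob centers
```

Weighting by cluster size in phase 2 is what keeps small, dense local clusters from being drowned out by large ones.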
Couchot, Alain. "Analyse statique de la terminaison des règles actives basée sur la notion de chemin maximal." Paris 12, 2001. http://www.theses.fr/2001PA120042.
Active rules are intended to enrich databases with a reactive behaviour. An active rule is composed of three main components: the event, the condition and the action. It is desirable to guarantee a priori the termination of a set of active rules. The aim of this thesis is to increase the number of termination situations detected by static analysis. We first identify some restrictions of previous static analysis methods. We then develop an algorithm for the static analysis of termination based on the notion of the maximal path of a node, which is intended to replace the notion of cycle used by previous termination algorithms. We present some applications and extensions of our termination algorithm, concerning active rules not included in a cycle, composite conditions, composite events, priorities between rules, and the modular design of rules.
Chaari, Anis. "Nouvelle approche d'identification dans les bases de données biométriques basée sur une classification non supervisée." Phd thesis, Université d'Evry-Val d'Essonne, 2009. http://tel.archives-ouvertes.fr/tel-00549395.
Full textAoun-Allah, Mohamed. "Le forage distribué des données : une approche basée sur l'agrégation et le raffinement de modèles." Thesis, Université Laval, 2006. http://www.theses.ulaval.ca/2006/23393/23393.pdf.
With the pervasive use of computers in all spheres of activity in our society, we are faced nowadays with an explosion of electronic data. This is why we need automatic tools able to analyze the data and provide us with relevant, summarized information with respect to some query. Data mining techniques are generally used for this task, but they require considerable computing time to analyze a huge volume of data. Moreover, if the data is geographically distributed, gathering it on the same site in order to create a model (a classifier for instance) can be time-consuming. To solve this problem, we propose to build several models, that is, one classifier per site. The rules constituting these classifiers are then aggregated and filtered based on statistical measures, and a validation process is carried out on samples from each site. The resulting model, called a metaclassifier, is, on one hand, a prediction tool for any new (unseen) instance and, on the other hand, an abstract view of the whole data set. We base our rule filtering approach on a confidence measure associated with each rule, which is computed statistically and then validated using the data samples (one from each site). Several validation techniques are considered, as discussed in this thesis.
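The aggregate-then-filter step can be sketched as follows, with hypothetical rules and validation samples; the confidence measure and thresholds are simple stand-ins for the statistical measures the thesis studies:

```python
# Each site contributes rules of the form (predicate, predicted_class, confidence).
site_rules = [
    (lambda r: r["age"] > 60, "high_risk", 0.90),
    (lambda r: r["age"] <= 60, "low_risk", 0.85),
    (lambda r: r["smoker"], "high_risk", 0.55),   # weak rule, should be filtered
]

# Small validation samples drawn from every site.
samples = [
    {"age": 70, "smoker": False, "label": "high_risk"},
    {"age": 45, "smoker": True,  "label": "low_risk"},
    {"age": 66, "smoker": True,  "label": "high_risk"},
    {"age": 30, "smoker": False, "label": "low_risk"},
]

def validated_confidence(rule):
    """Re-estimate a rule's confidence on the pooled remote samples."""
    pred, cls, _ = rule
    covered = [s for s in samples if pred(s)]
    if not covered:
        return 0.0
    return sum(s["label"] == cls for s in covered) / len(covered)

# Keep a rule only if both its statistical confidence and its validated
# confidence pass a threshold; the survivors form the metaclassifier.
metaclassifier = [r for r in site_rules
                  if r[2] >= 0.6 and validated_confidence(r) >= 0.6]
print(len(metaclassifier), "rules kept")   # the weak smoker rule is dropped
```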
Boudoin, Pierre. "L'interaction 3D adaptative : une approche basée sur les méthodes de traitement de données multi-capteurs." Phd thesis, Université d'Evry-Val d'Essonne, 2010. http://tel.archives-ouvertes.fr/tel-00553369.
Full textMarteau, Hubert. "Une méthode d'analyse de données textuelles pour les sciences sociales basée sur l'évolution des textes." Tours, 2005. http://www.theses.fr/2005TOUR4028.
This PhD thesis aims at giving sociologists a data-processing tool that allows them to analyse semi-structured open interviews. The proposed tool performs two steps: an indexation of the interviews followed by a classification. Usual indexing methods rely on a general statistical analysis. Such methods are suited to texts having rich content and structure (literary texts, scientific texts, etc.), which have more vocabulary and structure than interviews (limited to roughly 1000 words). On the basis of the assumption that sociological membership strongly shapes the form of speech, we propose various methods to evaluate the structure and evolution of the texts. The methods attempt to find new representations of texts (image, signal) and to extract values from these new representations. The selected classification method is tree-based (neighbour joining, NJ); it has low complexity and respects distances, which makes it a good way to provide classification support.
Lo, Céline. "Fermeture de la turbulence au second-ordre proche paroi basée sur l'analyse de données DNS." Paris 6, 2011. http://www.theses.fr/2011PA066632.
Full textZendjebil, Iman mayssa. "Localisation 3D basée sur une approche de suppléance multi-capteurs pour la réalité augmentée mobile en milieu extérieur." Thesis, Evry-Val d'Essonne, 2010. http://www.theses.fr/2010EVRY0024/document.
The democratization of mobile devices such as smartphones, PDAs and tablet PCs makes it possible to use Augmented Reality systems in large-scale environments. However, in order to implement such systems, many issues must be addressed, among which 3D localization is one of the most important. Indeed, estimating the position and orientation (also called the pose) of the viewpoint (of the camera or the user) allows the virtual objects to be registered over the visible part of the real world. We present an original localization system for large-scale environments which uses a markerless vision-based approach to estimate the camera pose, relying on natural feature points extracted from images. Since this type of method is sensitive to brightness changes, occlusions and sudden motion, which are likely to occur in outdoor environments, we use two additional sensors to assist the vision process. We aim to demonstrate the feasibility of such an assistance scheme in large-scale outdoor environments: the intent is to provide a fallback system for the vision in case of failure, as well as to reinitialize the vision system when needed. The complete localization system aims to be autonomous and adaptable to different situations. We present an overview of our system, its performance, and results obtained from experiments performed in an outdoor environment under real conditions.
Ta, Minh Thuy. "Techniques d'optimisation non convexe basée sur la programmation DC et DCA et méthodes évolutives pour la classification non supervisée." Electronic Thesis or Diss., Université de Lorraine, 2014. http://www.theses.fr/2014LORR0099.
Full textThis thesis focus on four problems in data mining and machine learning: clustering data streams, clustering massive data sets, weighted hard and fuzzy clustering and finally the clustering without a prior knowledge of the clusters number. Our methods are based on deterministic optimization approaches, namely the DC (Difference of Convex functions) programming and DCA (Difference of Convex Algorithm) for solving some classes of clustering problems cited before. Our methods are also, based on elitist evolutionary approaches. We adapt the clustering algorithm DCA–MSSC to deal with data streams using two windows models: sub–windows and sliding windows. For the problem of clustering massive data sets, we propose to use the DCA algorithm with two phases. In the first phase, massive data is divided into several subsets, on which the algorithm DCA–MSSC performs clustering. In the second phase, we propose a DCA–Weight algorithm to perform a weighted clustering on the obtained centers in the first phase. For the weighted clustering, we also propose two approaches: weighted hard clustering and weighted fuzzy clustering. We test our approach on image segmentation application. The final issue addressed in this thesis is the clustering without a prior knowledge of the clusters number. We propose an elitist evolutionary approach, where we apply several evolutionary algorithms (EAs) at the same time, to find the optimal combination of initial clusters seed and in the same time the optimal clusters number. The various tests performed on several sets of large data are very promising and demonstrate the effectiveness of the proposed approaches
Maiz, Nora. "Intégration de données par médiation basée sur les ontologies pour l'analyse en ligne (OLAP) à la demande." Thesis, Lyon 2, 2010. http://www.theses.fr/2010LYO20050.
Current decisional systems are modelled according to a multidimensional model dedicated to on-line analysis. Their principal limitations lie in their structure and volume, and in the fact that they do not take into account the evolution of data sources and analysis needs. In this thesis, we propose a dynamic architecture for on-line analysis on the fly, which differs from warehousing data in a target base with a fixed model. In our architecture, data can continue to evolve in their sources according to the activity they describe; data are collected and structured into analysis contexts only when an analysis is to be performed. To implement this architecture, we consider a solution composed of two main parts: the construction of a data integration system by mediation based on ontologies, and the implementation of a mechanism for building analysis contexts on the fly, likewise based on ontologies describing the decisional domain.
Shahzad, Atif. "Une Approche Hybride de Simulation-Optimisation Basée sur la fouille de Données pour les problèmes d'ordonnancement." Phd thesis, Université de Nantes, 2011. http://tel.archives-ouvertes.fr/tel-00647353.
Full textMerino, Laso Pedro. "Détection de dysfonctionements et d'actes malveillants basée sur des modèles de qualité de données multi-capteurs." Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2017. http://www.theses.fr/2017IMTA0056/document.
Naval systems represent a strategic infrastructure for international commerce and military activity, and their protection is thus an issue of major importance. Naval systems are increasingly computerized in order to support optimal and secure navigation. To attain this objective, on-board sensor systems provide navigation information to be monitored and controlled from distant computers. Because of their importance and computerization, naval systems have become a target for hackers. Maritime vessels also work in harsh and uncertain operational environments that produce failures, and navigation decision-making based on wrongly understood anomalies can be potentially catastrophic. Due to the particular characteristics of naval systems, existing detection methodologies cannot be applied; we propose quality evaluation and analysis as an alternative. The novelty of quality applications to cyber-physical systems shows the need for a general methodology, conceived and examined in this dissertation, to evaluate the quality of generated data streams. The identified quality elements allow us to introduce an original approach to detect malicious acts and failures. It consists of two processing stages: first, an evaluation of quality; second, the determination of agreement limits consistent with normal states, used to identify and categorize anomalies. Case studies comprising 13 scenarios on a fuel-tank training simulator platform and 11 scenarios on two aerial drones illustrate the interest and relevance of the obtained results.
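A minimal sketch of the two-stage idea: compute a per-reading quality indicator (here, consistency with the recent past, one of several dimensions a full quality model would use), then learn agreement limits on a known-normal period and flag exceedances. The signal, window size and three-sigma limits are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
# Simulated fuel-level sensor stream: smooth drain, then a spoofed jump.
t = np.arange(300)
level = 80.0 - 0.05 * t + rng.normal(0, 0.2, t.size)
level[220:] += 15.0                      # injected malicious offset

# A toy quality indicator: consistency of each reading with its recent past
# (accuracy, timeliness, etc. would be further dimensions in a full model).
window = 20
consistency = np.abs(level[window:] - np.convolve(
    level, np.ones(window) / window, mode="valid")[:-1])

# Agreement limits learned on a period known to be normal.
normal = consistency[:150]
upper = normal.mean() + 3 * normal.std()

alarms = np.where(consistency > upper)[0] + window
print("first alarm at t =", alarms.min())   # fires right at the t = 220 attack
```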
Shahzad, Muhammad Atif. "Une approche hybride de simulation-optimisation basée sur la fouille de données pour les problèmes d'ordonnancement." Nantes, 2011. http://archive.bu.univ-nantes.fr/pollux/show.action?id=53c8638a-977a-4b85-8c12-6dc88d92f372.
A data-mining-based approach to discovering previously unknown priority dispatching rules for the job shop scheduling problem is presented. The approach seeks the knowledge assumed to be embedded in the efficient solutions provided by an optimization module built using tabu search. The objective is to discover scheduling concepts using data mining, and hence to obtain a set of rules capable of approximating efficient solutions for a job shop scheduling problem (JSSP). A data-mining-based scheduling framework is presented and implemented for a job shop problem with maximum lateness and mean tardiness as the scheduling objectives. The results obtained are very promising.
Sidibe, Ibrahima dit Bouran. "Analyse non-paramétrique des politiques de maintenance basée sur des données des durées de vie hétérogènes." Thesis, Université de Lorraine, 2014. http://www.theses.fr/2014LORR0081/document.
In the reliability literature, many research works have dealt with the modeling, analysis and implementation of maintenance policies for equipment subject to random failures. The majority of these works rest on common assumptions, among which the distribution function of the equipment lifetimes is assumed to be known and the equipment is assumed to experience only one operating environment. Such assumptions are restrictive and may introduce a bias into the statistical analysis of the lifetime distribution, which in turn impacts the optimization of maintenance policies. In the present work, these two assumptions are relaxed. This relaxation makes it possible to take into account information on the conditions in which the equipment operates, and to focus on the statistical analysis of maintenance policies without using an intermediate parametric lifetime distribution. The objective of this thesis is thus the development of efficient statistical models and tools for managing the maintenance of equipment whose lifetime distribution is unknown and is defined through heterogeneous lifetime data. The thesis proposes a framework for determining maintenance strategies, from lifetime data acquisition through to the computation of optimal maintenance policies. The maintenance policies considered are performed on used equipment, which carries out its missions in different environments, each characterized by a degree of severity. In this context, a first mathematical model is proposed to evaluate the costs induced by maintenance strategies. The analysis of these costs helps to establish necessary and sufficient conditions for the existence of an optimal age at which to perform preventive maintenance. The maintenance costs are estimated using the kernel method, a non-parametric estimation method defined by two parameters: the kernel function and the smoothing parameter. The variability of the maintenance cost estimator is analyzed in depth with respect to the smoothing parameter, and these analyses show that the kernel estimator ensures a weak propagation of the errors due to the computation of the smoothing parameter. In addition, several simulations are run to estimate the optimal replacement age; the numerical results from the kernel method are close to the theoretical values, with a weak coefficient of variation. Two probabilistic extensions of the first mathematical model are proposed and discussed theoretically. Finally, to deal with the problem of delayed preventive maintenance, an approach is proposed that evaluates the risk induced by delaying preventive maintenance beyond the required optimal date, based on a proposed risk function.
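The core computation, estimating the lifetime distribution non-parametrically and minimizing the long-run cost rate of an age-replacement policy, can be sketched as below. The classical cost-rate formula C(T) = (c_p S(T) + c_f F(T)) / E[min(X, T)] and the synthetic two-environment data are illustrative assumptions, not the thesis's exact model:

```python
import numpy as np
from scipy.stats import gaussian_kde
from scipy.integrate import trapezoid

rng = np.random.default_rng(3)
# Heterogeneous lifetimes: two operating environments of different severity.
lifetimes = np.concatenate([rng.weibull(2.0, 150) * 10,    # mild
                            rng.weibull(2.0, 150) * 6])    # severe

kde = gaussian_kde(lifetimes)                   # non-parametric estimate
F = lambda t: kde.integrate_box_1d(0, t)        # estimated lifetime CDF

c_p, c_f = 1.0, 5.0                             # preventive vs corrective cost

def cost_rate(T, n_grid=200):
    """Long-run cost per unit time of an age-replacement policy at age T:
    C(T) = (c_p * S(T) + c_f * F(T)) / E[min(lifetime, T)]."""
    grid = np.linspace(0, T, n_grid)
    survival = 1.0 - np.array([F(t) for t in grid])
    expected_cycle = trapezoid(survival, grid)
    return (c_p * (1 - F(T)) + c_f * F(T)) / expected_cycle

ages = np.linspace(1, 15, 60)
costs = [cost_rate(T) for T in ages]
T_star = ages[int(np.argmin(costs))]
print(f"optimal preventive replacement age ~ {T_star:.2f}")
```

Because corrective failures cost more than planned replacements and the Weibull hazard is increasing, the cost curve has an interior minimum, which is the optimal age whose existence conditions the thesis establishes.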
Sidibe, Ibrahima dit Bouran. "Analyse non-paramétrique des politiques de maintenance basée sur des données des durées de vie hétérogènes." Electronic Thesis or Diss., Université de Lorraine, 2014. http://www.theses.fr/2014LORR0081.
Full textIn the reliability literature, several researches works have been developed to deal with modeling, analysis and implementation of maintenance policies for equipments subject to random failures. The majority of these works are based on common assumptions among which the distribution function of the equipment lifetimes is assumed to be known. Furthermore, the equipment is assumed to experience only one operating environment. Such assumptions are indeed restrictive and may introduce a bias in the statistical analysis of the distribution function of the equipment lifetimes which in turn impacts optimization of maintenance policies. In the present research work, these two particular assumptions are relaxed. This relaxation allows to take into account of information related to conditions where the equipment is being operating and to focus on the statistical analysis of maintenance policies without using an intermediate parametric lifetimes distribution. The objective of this thesis consists then on the development of efficient statistical models and tools for managing the maintenance of equipments whose lifetimes distribution is unknown and defined through the heterogeneous lifetimes data. Indeed, this thesis proposes a framework for maintenance strategies determination, from lifetimes data acquisition toward the computation of optimal maintenance policies. The maintenance policies considered are assumed to be performed on used equipments. These later are conduct to experience their missions within different environments each of which is characterized by a degree of severity. In this context, a first mathematical model is proposed to evaluate costs induced by maintenance strategies. The analysis of these costs helps to establish the necessary and sufficient conditions to ensure the existence of an optimal age to perform the preventive maintenance. The maintenance costs are fully estimated by using the Kernel method. This estimation method is non-parametric and defined by two parameters, namely the kernel function and the smoothing parameter. The variability of maintenance costs estimator is deeply analyzed according to the smoothing parameter of Kernel method. From these analyses, it is shown that Kernel estimator method ensures a weak propagation of the errors due to the computation of smoothing parameter. In addition, several simulations are made to estimate the optimal replacement age. These simulations figure out that the numerical results from the Kernel method are close to the theoretical values with a weak coefficient of variation. Two probabilistic extensions of the first mathematical model are proposed and theoretically discussed. To deal with the problem of delayed preventive maintenance, an approach is proposed and discussed. The proposed approach allows evaluating the risk that could induce the delay taken to perform a preventive maintenance at the required optimal date. This approach is based on risk analysis conduct on the basis of a proposed risk function
Thonet, Thibaut. "Modèles thématiques pour la découverte non supervisée de points de vue sur le Web." Thesis, Toulouse 3, 2017. http://www.theses.fr/2017TOU30167/document.
The advent of online platforms such as weblogs and social networking sites has provided Internet users with an unprecedented means to express their opinions on a wide range of topics, including policy and commercial products. This large volume of opinionated data can be explored and exploited through text mining techniques known as opinion mining or sentiment analysis. Contrary to traditional opinion mining work, which mostly focuses on positive and negative opinions (or an intermediate in between), we study a more challenging type of opinion: viewpoints. Viewpoint mining reaches beyond polarity-based opinions (positive/negative) and enables the analysis of more subtle opinions such as political opinions. In this thesis, we proposed unsupervised approaches, i.e. approaches which do not require any labeled data, based on probabilistic topic models, to jointly discover the topics and viewpoints expressed in opinionated data. In our first contribution, we explored the idea of separating opinion words (specific to both viewpoints and topics) from topical, neutral words based on parts of speech, inspired by similar practices in the literature on non-viewpoint-related opinion mining. Our second contribution tackles viewpoints expressed by social network users, studying to what extent social interactions between users, in addition to text content, can help identify users' viewpoints. Our contributions were evaluated and benchmarked against state-of-the-art baselines on real-world datasets.
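The thesis builds custom topic models with viewpoint variables; the base ingredient, a plain topic model in the LDA family, can be sketched with an off-the-shelf implementation on a hypothetical toy corpus:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "tax cuts boost growth and jobs",
    "raise taxes to fund public healthcare",
    "healthcare spending helps families",
    "lower taxes help small business growth",
    "public schools need more funding",
    "cut spending and reduce the deficit",
]

vec = CountVectorizer(stop_words="english")
counts = vec.fit_transform(docs)
vocab = vec.get_feature_names_out()

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
for k, topic in enumerate(lda.components_):
    top = topic.argsort()[::-1][:4]
    print(f"topic {k}:", ", ".join(vocab[i] for i in top))
```

Viewpoint topic models add a second latent variable per document (or per user, tied to the social graph) so that words are explained jointly by what is discussed and from which side.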
Auclair Beaudry, Jean-Sébastien. "Modelage de contexte simplifié pour la compression basée sur la transformée en cosinus discrète." Mémoire, Université de Sherbrooke, 2009. http://savoirs.usherbrooke.ca/handle/11143/1511.
Full textJaillet, Simon. "Catégorisation automatique de documents textuels : D'une représentation basée sur les concepts aux motifs séquentiels." Montpellier 2, 2005. http://www.theses.fr/2005MON20030.
Full textNguyen, Thu Thi Dieu. "Une approche basée sur la logique de description pour l'intégration de données relationnelles dans le web sémantique." Phd thesis, Université de Nice Sophia-Antipolis, 2008. http://tel.archives-ouvertes.fr/tel-00507482.
Full textL'objectif de cette thèse est de fournir des méthodes et des techniques pour résoudre ce problème d'intégration des bases de données. Nous proposons une approche combinant des représentations de schémas à base d'ontologie et des logiques de descriptions. Les schémas de base de données sont conçus en utilisant la méthodologie ORM. La stabilité et la flexibilité de ORM facilite la maintenance et l'évolution des systèmes d'intégration. Un nouveau langage d'ontologie web et ses fondements logiques sont proposées afin de capturer la sémantique des sources de données relationnelles, tout en assurant le raisonnement décidable et automatique sur les informations provenant des sources. Une traduction automatisée des modèles ORM en ontologies est introduite pour permettre d'extraire la sémantique des données rapidement et sans faillibilité. Ce mécanisme prévoit la coexistence d'autre sources d'informations, tel que l'hypertexte, intégrées à l'environnement web sémantique.
Cette thèse constitue une avancée dans un certain nombre de domaine, notamment dans l'intégration de données, l'ingénierie des ontologies, les logiques de descriptions, et la modélisation conceptuelle. Ce travail pourra fournir les fondations pour d'autres investigations pour intégrer les données provenant de sources relationnelles vers le web sémantique.
Heritier-Pingeon, Christine. "Une aide à la conception de systèmes de production basée sur la simulation et l'analyse de données." Phd thesis, INSA de Lyon, 1991. http://tel.archives-ouvertes.fr/tel-00840151.
Full text
Xu, Hao. "Estimation statistique d'atlas probabiliste avec les données multimodales et son application à la segmentation basée sur l'atlas." Phd thesis, Ecole Polytechnique X, 2014. http://pastel.archives-ouvertes.fr/pastel-00969176.
Full text
Heritier-Pingeon, Christine. "Une aide à la conception de systèmes de production basée sur la simulation et l'analyse de données." Lyon, INSA, 1991. http://tel.archives-ouvertes.fr/docs/00/84/01/51/PDF/1991_Heritier-Pingeon_Christine.pdf.
Full text
New forms of competition are pushing manufacturing systems toward ever greater flexibility. In highly automated systems, decisions taken in the design phase strongly influence the capabilities of the future system, its ease of adaptation to change, and thus its degree of flexibility. This work studies methods and tools for decision support in the design of manufacturing systems. The reader is first introduced to the scope of the study, then to the tools and methods employed. The workshop model used to support the approach is then presented, and the construction of a simulation plan is considered. These considerations are put into concrete form by defining a module for the automated generation of simulation plans associated with the chosen workshop model. Data analysis, used here as a knowledge-acquisition method, is then considered: an analysis method is proposed and tested. This work was developed to explore the possibilities of data analysis in this field and to evaluate them on the basis of numerous experiments.
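The automated generation of simulation plans mentioned above can be pictured as enumerating combinations of workshop parameters; the factors and levels in the sketch below are invented for illustration and not taken from the thesis.

```python
# Sketch: generate a full-factorial simulation plan from (hypothetical) factors.
from itertools import product

factors = {
    "machines": [2, 3, 4],             # number of machines in the workshop
    "buffer_size": [5, 10],            # inter-station buffer capacity
    "dispatch_rule": ["FIFO", "SPT"],  # job dispatching policy
}

plan = [dict(zip(factors, levels)) for levels in product(*factors.values())]
for run_id, config in enumerate(plan, start=1):
    print(run_id, config)              # one simulation run per configuration
```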
Vo, Nguyen Dang Khoa. "Compression vidéo basée sur l'exploitation d'un décodeur intelligent." Thesis, Nice, 2015. http://www.theses.fr/2015NICE4136/document.
Full text
This Ph.D. thesis studies the novel concept of the Smart Decoder (SDec), in which the decoder is given the ability to simulate the encoder and can conduct the R-D competition in the same way as the encoder. The proposed technique aims to reduce the signaling of competing coding modes and parameters. The general SDec coding scheme and several practical applications are proposed, followed by a longer-term approach that brings machine learning into video coding. The SDec coding scheme exploits a complex decoder able to reproduce the choices of the encoder from causal references, thus eliminating the need to signal coding modes and associated parameters. Several practical applications of the general SDec scheme are tested, using different coding modes during the competition on the reference blocks. Although the selection of the SDec reference block remains simple and limited, interesting gains are observed. The long-term research presents an innovative method that makes further use of the processing capacity of the decoder: machine learning techniques are exploited in video coding to reduce the signaling overhead. Practical applications are given, using a classifier based on support vector machines to predict the coding modes of a block. The block classification uses causal descriptors consisting of different types of histograms. Significant bit-rate savings are obtained, which confirms the potential of the approach.
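A minimal sketch of the classification step described above, assuming scikit-learn and synthetic histogram descriptors (the actual features, labels, and training protocol are specific to the video codec):

```python
# Sketch: SVM predicting a block's coding mode from histogram descriptors.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((500, 32))                      # causal histogram descriptors (synthetic)
y = rng.integers(0, 3, size=500)               # coding-mode labels (synthetic)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = SVC(kernel="rbf").fit(X_train, y_train)  # decoder-side mode predictor
print("held-out accuracy:", clf.score(X_test, y_test))
```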
Sellami, Akrem. "Interprétation sémantique d'images hyperspectrales basée sur la réduction adaptative de dimensionnalité." Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2017. http://www.theses.fr/2017IMTA0037/document.
Full text
Hyperspectral imagery makes it possible to acquire rich spectral information about a scene across several hundred or even thousands of narrow, contiguous spectral bands. However, given the high number of spectral bands, the strong inter-band spectral correlation, and the redundancy of spectro-spatial information, the interpretation of these massive hyperspectral data is one of the major challenges for the remote sensing community. In this context, the key problem is to reduce the number of unnecessary spectral bands, that is, to reduce the redundancy and high correlation between bands while preserving the relevant information. Projection approaches transform the hyperspectral data into a reduced subspace by combining all the original spectral bands, whereas band selection approaches attempt to find a subset of relevant spectral bands. In this thesis, we first focus on hyperspectral image classification, attempting to integrate spectro-spatial information into the dimension reduction in order to improve classification performance and overcome the loss of spatial information inherent in projection approaches. We therefore propose a hybrid model that preserves spectro-spatial information by exploiting the tensor model in the locality preserving projection approach (TLPP) and uses constraint band selection (CBS) as an unsupervised approach to select discriminant spectral bands. To model the uncertainty and imperfection of these reduction approaches and classifiers, we propose an evidential approach based on Dempster-Shafer Theory (DST). In a second step, we extend the hybrid model by exploiting the semantic knowledge extracted from the features obtained by the TLPP approach to enrich the CBS technique. The resulting approach selects a relevant subset of spectral bands that are simultaneously informative, discriminant, distinctive, and weakly redundant: it selects the discriminant and distinctive bands with the CBS technique, injecting rules obtained with knowledge-extraction techniques to automatically and adaptively select the optimal subset of relevant bands. The performance of our approach is evaluated on several real hyperspectral datasets.
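The band selection idea can be approximated, for intuition only, by greedily keeping bands that are weakly correlated with those already selected; the data and the 0.9 threshold below are synthetic and hypothetical, and this is not the CBS technique itself.

```python
# Sketch: greedy selection of weakly correlated spectral bands (synthetic cube).
import numpy as np

rng = np.random.default_rng(1)
cube = rng.random((100, 100, 200))             # H x W x bands, synthetic
pixels = cube.reshape(-1, cube.shape[-1])      # flatten spatial dimensions
corr = np.abs(np.corrcoef(pixels, rowvar=False))  # band-to-band correlations

selected = [0]
for band in range(1, pixels.shape[1]):
    if corr[band, selected].max() < 0.9:       # keep band only if weakly redundant
        selected.append(band)
print(f"kept {len(selected)} of {pixels.shape[1]} bands")
```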
Teguiak, Henry Valery. "Construction d'ontologies à partir de textes : une approche basée sur les transformations de modèles." Chasseneuil-du-Poitou, Ecole nationale supérieure de mécanique et d'aérotechnique, 2012. http://tel.archives-ouvertes.fr/docs/00/78/62/60/PDF/ISAE-ENSMA_2012-12-12_Thesis_TEGUIAK.pdf.
Full text
Since its emergence in the early 1990s, the notion of ontology has spread quickly through many areas of research. Given the promise of this concept, many studies focus on the use of ontologies in areas such as information retrieval, electronic commerce, the Semantic Web, and data integration. The effectiveness of all this work rests on the assumption that a domain ontology already exists and can be used. However, designing such an ontology is particularly difficult if it is to be built in a consensual way. While there are tools for editing ontologies that are assumed to be already designed, and several natural language processing platforms able to automatically analyze corpora of texts and annotate them syntactically and statistically, it is hard to find a generally accepted procedure for developing a domain ontology in a progressive, explicit, and traceable manner from a set of information resources in that domain. The goal of the ANR DaFOE4App (Differential and Formal Ontology Editor for Application) project, within which our work falls, was to promote the emergence of such a set of tools. Unlike other ontology-development tools, the DaFOE platform presented in this thesis does not impose a methodology with a fixed number of steps or a fixed representation of those steps. Indeed, we generalize the ontology-development process to any number of steps. The interest of such a generalization is, for example, the possibility of refining the development process by inserting or modifying steps, or of removing steps to simplify it. The aim is to minimize the impact on the overall development process of adding, deleting, or modifying a step while maintaining its overall consistency. To achieve this, our approach uses Model Driven Engineering to characterize each step by a model, reducing the problem of moving from one step to another to a model-transformation problem. Mappings established between models are then used to semi-automate the ontology-development process. Since this whole process is stored in a database, we propose an extension for handling mappings in Model-Based Databases (MBDB), which can store both data and the models describing these data. We also propose a query language named MQL (Mapping Query Language) that hides the complexity of the MBDB structure. The originality of MQL lies in its ability, through syntactically compact queries, to explore the graph of mappings by exploiting their transitivity when retrieving information.
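The transitive exploration of the mapping graph that MQL performs can be approximated, for intuition only, by computing a transitive closure over mapping edges; the step names below are hypothetical and the sketch bears no relation to MQL's actual syntax.

```python
# Sketch: transitive exploration of a mapping graph between development steps.
import networkx as nx

g = nx.DiGraph()
# Hypothetical mappings between models of successive ontology-building steps.
g.add_edges_from([("terms", "concepts"), ("concepts", "ontology")])

closure = nx.transitive_closure(g)
print(list(closure.successors("terms")))   # reachable via mapping transitivity
```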
Izza, Saïd. "Intégration des systèmes d'information industriels : une approche flexible basée sur les services sémantiques." Phd thesis, Ecole Nationale Supérieure des Mines de Saint-Etienne, 2006. http://tel.archives-ouvertes.fr/tel-00780240.
Full text
Wang, Zhiqiang. "Aide à la décision en usinage basée sur des règles métier et apprentissages non supervisés." Thesis, Nantes, 2020. http://www.theses.fr/2020NANT4038.
Full text
In the general context of Industry 4.0, large volumes of manufacturing data are available from instrumented machine tools. They are worth exploiting not only to improve machine-tool performance but also to support decision making in operational management. This thesis proposes a decision-aid system for intelligent, connected machine tools based on data mining. The first step in a data mining approach is the selection of relevant data, so raw data must be classified into different groups of contexts. This thesis proposes a contextual classification procedure for machining based on unsupervised learning with Gaussian mixture models. Building on this contextual classification, different machining incidents can be detected in real time, including chatter, tool breakage, and excessive vibration; the thesis introduces a set of business rules for incident detection. The operational context in which incidents occur is deciphered from the contextual classification, which characterizes the type of machining and the tool engagement. New, relevant Key Performance Indicators (KPIs) can then be proposed, based on this contextual information and the detected incidents, to support decision making in operational management.
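A minimal sketch of the contextual-classification step, assuming scikit-learn's GaussianMixture on synthetic per-window features (the real signals, features, and number of contexts are machine-specific):

```python
# Sketch: unsupervised contextual classification of machining data with a GMM.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
# Synthetic features per time window: spindle load and vibration RMS.
air_cutting = rng.normal([0.1, 0.2], 0.05, size=(200, 2))
roughing = rng.normal([0.8, 0.6], 0.05, size=(200, 2))
X = np.vstack([air_cutting, roughing])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
contexts = gmm.predict(X)                  # one context label per window
print(np.bincount(contexts))               # windows per machining context
```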
Georgescu, Vera. "Classification de données multivariées multitypes basée sur des modèles de mélange : application à l'étude d'assemblages d'espèces en écologie." Phd thesis, Université d'Avignon, 2010. http://tel.archives-ouvertes.fr/tel-00624382.
Full text