Dissertations / Theses on the topic 'Données de santé hétérogènes'
Lelong, Romain. "Accès sémantique aux données massives et hétérogènes en santé." Thesis, Normandie, 2019. http://www.theses.fr/2019NORMR030/document.
Clinical data are produced in the course of medical practice by different health professionals, in several places and in various formats. They are therefore heterogeneous in both nature and structure, and their particularly large volume qualifies them as Big Data. The work carried out in this thesis aims at proposing an effective information retrieval method for this type of complex and massive data. First, access to clinical data is constrained by the need to model clinical information. This can be done within Electronic Health Records and, to a larger extent, within data warehouses. In this thesis, I propose a proof of concept of a search engine providing access to the information contained in the Semantic Health Data Warehouse of the Rouen University Hospital. A generic data model allows this data warehouse to view information as a graph of data, thus modelling the information while preserving its conceptual complexity. In order to provide search functionalities adapted to this generic representation, a query language giving access to clinical information through the various entities of which it is composed has been developed and implemented as part of this thesis's work. Second, the massiveness of clinical data is also a major technical challenge that hinders efficient information retrieval. The initial implementation of the proof of concept highlighted the limits of relational database management systems when used in the context of clinical data. A migration to a NoSQL key-value store was then carried out. Although offering good atomic data access performance, this migration nevertheless required additional developments and the design of a suitable hardware and software architecture to provide advanced search functionalities.
Finally, the contribution of this work within the general context of the Semantic Health Data Warehouse of the Rouen University Hospital was evaluated. The proof of concept proposed in this work was used to access semantic descriptions of information in order to meet the criteria for including and excluding patients in clinical studies. In this evaluation, a total or partial answer is given to 72.97% of the criteria. In addition, the genericity of the tool has also made it possible to use it in other contexts, such as documentary and bibliographic information retrieval in health.
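The abstract above describes clinical information modelled as a graph of data and queried through the entities it is composed of. The thesis's actual data model and query language are not reproduced here; the sketch below only illustrates, with invented entity and relation names, what querying information represented as a graph of triples can look like.

```python
# Clinical information as a graph of (subject, relation, object) triples.
# All identifiers below are invented for illustration.
graph = {
    ("patient/1", "hasDocument", "report/7"),
    ("report/7", "mentions", "concept/diabetes"),
    ("report/7", "mentions", "concept/metformin"),
    ("patient/2", "hasDocument", "report/9"),
    ("report/9", "mentions", "concept/diabetes"),
}

def query(graph, predicate=None, obj=None):
    """Return the subjects of triples matching a predicate and/or object
    (None acts as a wildcard)."""
    return {s for s, p, o in graph
            if (predicate is None or p == predicate)
            and (obj is None or o == obj)}

# Which reports mention diabetes? Then: which patients hold those reports?
reports = query(graph, "mentions", "concept/diabetes")
patients = {s for s, p, o in graph if p == "hasDocument" and o in reports}
```

Chaining two such lookups is the kind of entity-to-entity navigation a graph query language makes first-class.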
Griffier, Romain. "Intégration et utilisation secondaire des données de santé hospitalières hétérogènes : des usages locaux à l'analyse fédérée." Electronic Thesis or Diss., Bordeaux, 2024. http://www.theses.fr/2024BORD0479.
Healthcare data can be used for purposes other than those for which it was initially collected: this is the secondary use of health data. In the hospital context, a classic strategy to overcome the obstacles to the secondary use of healthcare data (data-related and organizational barriers) is to set up Clinical Data Warehouses (CDWs). This thesis describes three contributions to the Bordeaux University Hospital's CDW. Firstly, an instance-based, privacy-preserving method for mapping numerical biology data elements is presented, with an F-measure of 0.850, making it possible to reduce the semantic heterogeneity of the data. Next, an adaptation of the i2b2 clinical data integration model is proposed to enable CDW data persistence in a NoSQL database, Elasticsearch. This implementation has been evaluated on the Bordeaux University Hospital's CDW, showing improved performance in terms of storage and query time compared with a relational database. Finally, the Bordeaux University Hospital's CDW environment is presented, with the description of a first CDW dedicated to local uses that can be used autonomously by end users (i2b2), and a second CDW dedicated to federated networks (OMOP) enabling participation in the DARWIN-EU federated network.
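The first contribution above matches numerical biology data elements from their instances rather than their labels. The thesis's privacy-preserving method is not detailed here; the stdlib sketch below only illustrates the instance-based idea, comparing summary profiles of the values instead of raw data. The test names, the values and the scoring formula are all invented.

```python
import statistics

def profile(values):
    """Summary profile of a numeric data element (no raw values shared)."""
    return {"mean": statistics.fmean(values),
            "stdev": statistics.stdev(values),
            "min": min(values), "max": max(values)}

def similarity(p, q):
    """Crude instance-based score: near 1 when profiles coincide,
    decreasing toward 0 as they diverge."""
    spread = max(p["max"], q["max"]) - min(p["min"], q["min"]) or 1.0
    d = (abs(p["mean"] - q["mean"]) + abs(p["stdev"] - q["stdev"])) / spread
    return 1.0 / (1.0 + d)

# Serum sodium (mmol/L) recorded by two sources, vs. leukocytes (10^9/L).
sodium_a = [138, 140, 141, 139, 142, 137, 140]
sodium_b = [139, 141, 140, 138, 143, 136, 141]
leuko = [6.1, 7.4, 5.8, 9.0, 4.5, 8.2, 6.7]

s_match = similarity(profile(sodium_a), profile(sodium_b))
s_nomatch = similarity(profile(sodium_a), profile(leuko))
```

Only aggregate profiles cross source boundaries here, which is one simple way to keep matching compatible with privacy constraints.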
Pauly, Vanessa. "Evaluation de l'abus et du détournement des médicaments psychoactifs en addictovigilance : analyse de bases de données hétérogènes." Thesis, Aix-Marseille 2, 2011. http://www.theses.fr/2011AIX20696.
The objective of this work was to analyze the abuse, dependence and diversion of psychoactive medicines in real-life settings, by jointly using different indicators drawn from multiple data sources in order to present a synthetic view. The data sources used in this work come from the tools developed by the Centres for Evaluation and Information on Pharmacodependence (CEIP). They make it possible to measure drug abuse directly in specific populations of dependent patients or patients under opiate treatment (the OPPIDUM survey, Observation of Illicit Psychotropic Products or Products Diverted from their Medicinal Use). These tools also measure diversion through the phenomenon of "doctor shopping" (overlapping prescriptions) and through the number of patients presenting deviant behaviour in general health insurance databases, as well as through falsified prescriptions presented at pharmacies (the OSIAP survey, Forged Prescriptions Indicating Potential Abuse). This multi-source approach was first applied to the abuse and diversion of clonazepam (1st publication). This study highlighted the emerging problem of clonazepam diversion, after that of flunitrazepam, and also illustrated the difficulty of consistently analysing the information gathered by these different data sources. A good system for monitoring drug diversion and abuse must allow trend analysis. We therefore proposed a classification method aimed at revealing profiles of subjects with deviant behaviour, applied over time to study the diversion of methylphenidate over a four-year period (2nd publication). This classification method was then applied jointly with a method measuring doctor shopping to analyse the diversion of High Dosage Buprenorphine (HDB) (3rd publication).
This study revealed an important problem of HDB diversion, demonstrated that the two methods were globally concordant, and made it possible to evaluate their respective advantages for monitoring the abuse and diversion of prescription drugs. These two methods were then analysed jointly with data from the OPPIDUM and OSIAP surveys to study and compare the diversion of benzodiazepines (4th publication) and opioids (5th publication). This multi-source approach limits the biases attached to each method taken individually. Our work points out the relevance of such a multi-source system for estimating the abuse of a prescription drug and comparing it with other substances. Nevertheless, the development of such a system in the field of drug dependence is relatively new, and requires improvements concerning the integration of other data sources and the methodology used to combine and synthesize the information obtained. Finally, such a multi-source system has the potential to make a real contribution to the field of drug dependence in France.
Michel, Franck. "Intégrer des sources de données hétérogènes dans le Web de données." Thesis, Université Côte d'Azur (ComUE), 2017. http://www.theses.fr/2017AZUR4002/document.
To a great extent, the success of the Web of Data depends on the ability to reach legacy data locked in silos inaccessible from the web. In the last 15 years, various works have tackled the problem of exposing structured data in the Resource Description Framework (RDF). Meanwhile, the overwhelming success of NoSQL databases has made the database landscape more diverse than ever, and NoSQL databases are strong potential contributors of valuable linked open data. Hence, the object of this thesis is to enable RDF-based data integration over heterogeneous data sources and, in particular, to harness NoSQL databases to populate the Web of Data. We propose a generic mapping language, xR2RML, to describe the mapping of heterogeneous data sources into an arbitrary RDF representation. xR2RML relies on and extends previous works on the translation of RDBs, CSV/TSV and XML into RDF. Given such an xR2RML mapping, we propose either to materialize RDF data or to dynamically evaluate SPARQL queries on the native database. In the latter case, we follow a two-step approach. The first step translates a SPARQL query into a pivot abstract query, based on the xR2RML mapping of the target database to RDF. In the second step, the abstract query is translated into a concrete query, taking into account the specificities of the database query language. Great care is taken of query optimization opportunities, both at the abstract and the concrete levels. To demonstrate the effectiveness of our approach, we have developed a prototype implementation for MongoDB, the popular NoSQL document store, and validated the method on a real-life use case in Digital Humanities.
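The materialization branch described above applies a mapping to native documents to produce RDF triples. The toy sketch below does not use the actual xR2RML vocabulary or syntax; the document, the IRI template and the field-to-predicate table are invented for illustration only.

```python
# A MongoDB-like document and a toy mapping, in the spirit of xR2RML
# (the real xR2RML mapping language is RDF-based and far richer).
doc = {"_id": "m42", "title": "Les Mystères de Paris", "year": 1842}

mapping = {
    # Assumed IRI scheme for the example, not a real vocabulary.
    "subject_template": "http://example.org/manuscript/{_id}",
    "predicate_object": {
        "title": "http://purl.org/dc/terms/title",
        "year": "http://purl.org/dc/terms/date",
    },
}

def materialize(doc, mapping):
    """Materialization step: apply the mapping to one document,
    yielding (subject, predicate, object) triples."""
    subject = mapping["subject_template"].format(**doc)
    return [(subject, pred, doc[field])
            for field, pred in mapping["predicate_object"].items()
            if field in doc]

triples = materialize(doc, mapping)
```

The alternative branch, query rewriting, would instead translate a SPARQL pattern over these virtual triples into a native MongoDB query.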
Arnaud, Bérenger. "Exploitation et partage de données hétérogènes et dynamiques." Thesis, Montpellier 2, 2013. http://www.theses.fr/2013MON20025/document.
In the context of numeric data, developing dedicated software entails a number of cost factors. Adapting generic tools, in contrast, has its own set of costs, requiring integration work from developers and adaptation effort from end users. The aim of our approach is to consider the different points of interaction with the data in order to improve their exploitation, whether the data are provided or generated through collaboration. The definitions and problems related to data depend on the domain from which the data come and on the treatments applied to them. In this work we have opted for a holistic approach in which we consider the full range of angles; the result is a summary of the emergent concepts and of the equivalences across domains. The first contribution consists of improving collaborative document mark-up. Two improvements are provided by our tool, Coviz. 1) Resource tagging unique to each user, who organises their own labels according to a personal poly-hierarchy; each user may take other users' approaches into consideration through tag sharing, and the system supplies additional context by harvesting documents from open archives. 2) The tool applies the concept of facets to the interface and combines them to provide search by keyword or by characteristic selection; this point is shared by all users, and the actions of an individual user impact the whole group. The major contribution, which is confidential, is a framework christened DIP, for Data Interaction and Presentation. Its goal is to increase the user's freedom of expression over interaction with and access to data. It reduces hardware and software constraints by adding a new access point between the user and the raw data, as well as generic pivots.
From the end user's point of view, the gains include filter expressiveness, sharing, browser state persistence, automation of day-to-day tasks, etc. DIP has been stress-tested under real-life conditions, with real users and limited resources, with the software KeePlace. Acknowledgement is given to KeePlace, which initiated this thesis.
Zhang, Bo. "Reconnaissance de stress à partir de données hétérogènes." Thesis, Université de Lorraine, 2017. http://www.theses.fr/2017LORR0113/document.
In modern society, the stress of individuals has become a common problem. Continuous stress can lead to various mental and physical problems, especially for people who constantly face emergency situations (e.g., firemen): it may alter their actions and put them in danger. It is therefore meaningful to provide an assessment of an individual's stress. Based on this idea, the Psypocket project aims at building a portable system able to accurately analyze the stress state of an individual based on his physiological, psychological and behavioural modifications, and then to offer feedback solutions to regulate this state. The research of this thesis is an essential part of the Psypocket project. We discuss the feasibility and the interest of stress recognition from heterogeneous data: not only physiological signals, such as electrocardiography (ECG), electromyography (EMG) and electrodermal activity (EDA), but also reaction time (RT) are adopted to recognize different stress states of an individual. For stress recognition, we propose an approach based on an SVM (Support Vector Machine) classifier. The results obtained show that reaction time can be used to estimate the stress level of an individual, whether or not in addition to the physiological signals. Besides, we discuss the feasibility of an embedded system that would carry out the complete data processing. The study of this thesis can therefore contribute to building a portable system that recognizes the stress of an individual in real time from heterogeneous data such as physiological signals and RT.
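The abstract above combines physiological features and reaction time in an SVM classifier. A minimal sketch of that pipeline with scikit-learn follows; the feature set, the synthetic values and the class separation are all invented for illustration and do not reflect the thesis's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic feature vectors: [mean heart rate (ECG), EMG amplitude,
# EDA level, reaction time in ms] -- illustrative features only.
n = 200
relaxed = rng.normal([70, 0.2, 2.0, 350], [5, 0.05, 0.3, 30], size=(n, 4))
stressed = rng.normal([95, 0.5, 5.0, 480], [8, 0.10, 0.6, 50], size=(n, 4))

X = np.vstack([relaxed, stressed])
y = np.array([0] * n + [1] * n)  # 0 = relaxed, 1 = stressed

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize features, then fit an RBF-kernel SVM.
scaler = StandardScaler().fit(X_train)
clf = SVC(kernel="rbf").fit(scaler.transform(X_train), y_train)
accuracy = clf.score(scaler.transform(X_test), y_test)
```

Dropping the reaction-time column and retraining is one way to probe, as the thesis does on real data, how much RT adds to the physiological signals.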
Hamdoun, Khalfallah Sana. "Construction d'entrepôts de données par intégration de sources hétérogènes." Paris 13, 2006. http://www.theses.fr/2006PA132039.
This work describes the construction of a data warehouse by the integration of heterogeneous data, which may be structured, semi-structured or unstructured. We propose a theoretical approach based on the definition of an integration environment, formed by data sources and inter-schema relationships between these sources (equivalence and strict order relations). Our approach is composed of five steps allowing the choice of data warehouse components, the generation of a global schema and the construction of data warehouse views. Multidimensional schemas are also proposed. All the stages proposed in this work are implemented in a functional prototype (using SQL and XQuery). Keywords: data integration, data warehouses, heterogeneous data, inter-schema relationships, relational, object-relational, XML, SQL, XQuery.
Badri, Mohamed. "Maintenance des entrepôts de données issus de sources hétérogènes." Paris 5, 2008. http://www.theses.fr/2008PA05S006.
This work has been performed in the field of data warehouses (DWs). DWs are at the core of decision-making information systems and are used to support decision-making tools (OLAP, data mining, reporting). A DW is a living entity whose content is continuously fed and refreshed, and keeping its aggregates up to date is crucial for decision making. That is why DW maintenance has a strategic place in the decision system process, and is also used as a performance criterion of a DW system. As communication technologies, especially the Internet, keep growing, data are becoming more and more heterogeneous and distributed; they can be classified into three categories: structured data, semi-structured data and unstructured data. In this work we first present a modelling approach aiming to integrate all these data. On the basis of this approach, we then propose a process that ensures incremental maintenance of warehouse data and aggregates. We also propose a tree structure to manage aggregates, as well as algorithms that ensure its evolution. Being in a context of heterogeneity, all our proposals are independent of the warehouse model and of its management system. In order to validate our contribution, the Heterogeneous Data Integration and Maintenance (HDIM) prototype has been developed and some experiments performed.
Gürgen, Levent. "Gestion à grande échelle de données de capteurs hétérogènes." Grenoble INPG, 2007. http://www.theses.fr/2007INPG0093.
This dissertation deals with the issues related to the scalable management of heterogeneous sensor data. In fact, sensors are becoming less and less expensive, more and more numerous and heterogeneous. This naturally raises the scalability problem and the need for integrating data gathered from heterogeneous sensors. We propose a distributed and service-oriented architecture in which data processing tasks are distributed at several levels. Data management functionalities are provided in terms of "services", in order to hide sensor heterogeneity behind generic services. We equally deal with system management issues in sensor farms, a subject not yet explored in this context.
Jautzy, Olivier. "Intégration de sources de données hétérogènes : Une approche langage." Marne-la-vallée, ENPC, 2000. http://www.theses.fr/2000ENPC0002.
Full textCavalier, Mathilde. "La propriété des données de santé." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSE3071/document.
The question of the protection and enhancement of health data is constantly being revisited, because it lies at the crossroads of conflicting interests. Legal, health and economic logics confront each other and express themselves through a particularly heterogeneous set of regulations on health data. Property rights seem able to reconcile these seemingly contradictory concerns. Given the place of property rights in our legal system and the uniqueness of health data, their reconciliation deserves a study of some magnitude. The first step is to assess the compatibility of property rights with health data. Under even a simplified vision of property, the existing rights over data already amount to property rights, but rights which, because of the particularity of health data, are largely limited. The second step is therefore the question of the relevance of applying "more complete" property rights to health data. However, we note that the specificity of health data implies that such a solution is not the most effective for achieving a fair balance between patients and data collectors. Nevertheless, other solutions are possible.
Giersch, Arnaud. "Ordonnancement sur plates-formes hétérogènes de tâches partageant des données." Phd thesis, Université Louis Pasteur - Strasbourg I, 2004. http://tel.archives-ouvertes.fr/tel-00008222.
Full textBavueza, Munsana Dia Lemfu. "Ravir : un système de coopération des bases de données hétérogènes." Montpellier 2, 1987. http://www.theses.fr/1987MON20265.
Full textNaacke, Hubert. "Modèle de coût pour médiateur de bases de données hétérogènes." Versailles-St Quentin en Yvelines, 1999. http://www.theses.fr/1999VERS0013.
Distributed systems access diverse information sources through declarative queries. One solution to the problems raised by source heterogeneity relies on the mediator/wrapper architecture. In this architecture, the mediator accepts a user query as input, processes it by accessing the sources through the relevant wrappers, and returns the answer to the user. The mediator offers a global, centralized view of the sources, while the wrappers provide the mediator with uniform access to the sources. To process a query efficiently, the mediator must optimize the plan describing the query's execution. To this end, several semantically equivalent plans are considered and the cost (i.e., the response time) of each plan is estimated, in order to choose and execute the cheapest one. The mediator estimates the cost of the operations processed by the sources using the cost information the sources export. However, because sources are autonomous, the exported information may prove insufficient to estimate operation costs with adequate precision. This thesis proposes a new method allowing a wrapper developer to export a source's cost model to the mediator. The exported model contains statistics describing the data stored in the source, as well as mathematical functions to evaluate the cost of the processing performed by the source. When the wrapper developer lacks information or means, he can provide a partial cost model that is automatically completed with the generic model predefined within the mediator. We validate the proposed cost model experimentally by accessing web sources. This validation shows the effectiveness of the generic cost model, as well as that of models specialized according to the particularities of the sources and the application cases.
Durand, Guillermo. "Tests multiples et bornes post hoc pour des données hétérogènes." Thesis, Sorbonne université, 2018. http://www.theses.fr/2018SORUS289/document.
This manuscript presents my contributions in three areas of multiple testing where data heterogeneity can be exploited to better detect false null hypotheses or improve signal detection while controlling false positives: p-value weighting, discrete tests, and post hoc inference. First, a new class of data-driven weighting procedures, incorporating group structure and true null proportion estimators, is defined, and its False Discovery Rate (FDR) control is proven asymptotically. This procedure also achieves power optimality under some conditions on the proportion estimators. Secondly, new step-up and step-down procedures, tailored for discrete tests under independence, are designed to control the FDR for arbitrary null marginals of the p-values. Finally, new confidence bounds for post hoc inference (called post hoc bounds), tailored for the case where the signal is localized, are studied, and the associated optimal post hoc bounds are derived with a simple algorithm.
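The step-up procedures mentioned above build on the classical Benjamini-Hochberg (BH) step-up procedure. The sketch below implements plain BH, not the thesis's discrete-test variants, on an invented list of p-values.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """BH step-up procedure: reject the k smallest p-values, where
    k = max { i : p_(i) <= alpha * i / m } (0 if the set is empty)."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    k = int(np.max(np.nonzero(below)[0])) + 1 if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True  # reject the k smallest p-values
    return rejected

# Invented p-values for illustration.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.32, 0.9]
rejected = benjamini_hochberg(pvals, alpha=0.05)
```

Under independence this controls the FDR at level alpha; the thesis's contribution sharpens such procedures when the test statistics are discrete.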
Dematraz, Jessica. "Méthodologies d'extraction des connaissances issues de données hétérogènes pour l'innovation." Thesis, Aix-Marseille, 2018. http://www.theses.fr/2018AIXM0716.
In the age of Big Data, where information and communication technologies are in full swing, access to information has never been so easy and fast. Paradoxically, strategic information, that is, "useful" information, the information that facilitates decision-making, has never been so rare and difficult to find. Hence the importance of setting up a competitive intelligence process, and more precisely information monitoring, in order to effectively exploit the information environment of an organization, a sector or even an entire country. Today, the predominance of information in a professional context no longer needs to be proven. Monitoring needs, whatever their nature (strategic, competitive, technological, regulatory, etc.), concern entities of all sectors (public or private) and sizes (SMEs, mid-sized companies, large groups) in all fields of activity. Yet there is no single method applicable to everything and everyone, but a plurality of methods that must coexist for knowledge to emerge.
Mahfoudi, Abdelwahab. "Contribution a l'algorithmique pour l'analyse des bases de données statistiques hétérogènes." Dijon, 1995. http://www.theses.fr/1995DIJOS009.
Full textRenard, Hélène. "Equilibrage de charge et redistribution de données sur plates-formes hétérogènes." Phd thesis, Ecole normale supérieure de lyon - ENS LYON, 2005. http://tel.archives-ouvertes.fr/tel-00012133.
Full textFereres, Yohan. "Stratégies d'arbitrage systématique multi-classes d'actifs et utilisation de données hétérogènes." Phd thesis, Université Paris-Est, 2013. http://tel.archives-ouvertes.fr/tel-00987635.
Full textManolescu, Goujot Ioana Gabriela. "Techniques d'optimisation pour l'interrogation des sources de données hétérogènes et distribuées." Versailles-St Quentin en Yvelines, 2001. http://www.theses.fr/2001VERS0027.
Full textClaeys, Emmanuelle. "Clusterisation incrémentale, multicritères de données hétérogènes pour la personnalisation d’expérience utilisateur." Thesis, Strasbourg, 2019. http://www.theses.fr/2019STRAD039.
In many activity sectors (health, online sales, etc.), designing from scratch an optimal solution for a given problem (finding a protocol to increase the cure rate, designing a web page to promote the purchase of one or more products, etc.) is often very difficult or even impossible. To face this difficulty, designers (doctors, web designers, production engineers, etc.) often work incrementally, by successive improvements of an existing solution. However, defining the most relevant changes remains a difficult problem. A solution adopted more and more frequently is therefore to constructively compare different alternatives (also called variations) in order to determine the best one through an A/B test. The idea is to implement these alternatives and compare the results obtained, i.e. the respective rewards obtained by each variation. To identify the optimal variation in the shortest possible time, many testing methods use an automated dynamic allocation strategy, which quickly and automatically allocates the tested subjects to the most efficient variation through reinforcement learning algorithms (such as multi-armed bandit methods). These methods have proven useful in practice but also show limitations, in particular an excessive latency (i.e. the delay between the arrival of a subject to be tested and its allocation), a lack of explainability of the choices made, and poor integration of an evolving context describing the subject's behaviour before being tested. The overall objective of this thesis is to propose an understandable, generic A/B testing method allowing dynamic real-time allocation that takes into account both the static and temporal characteristics of the subjects.
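The dynamic allocation idea above can be illustrated with Thompson sampling, one common bandit strategy (the thesis's own algorithm is not reproduced here). The two variations, their conversion rates and the trial count are invented for the example.

```python
import random

random.seed(42)

# True (unknown) conversion rates of two page variations -- illustrative.
true_rates = {"A": 0.05, "B": 0.11}

# One Beta(1, 1) prior per variation, summarized by wins/trials counters.
stats = {v: {"wins": 0, "trials": 0} for v in true_rates}

def choose():
    """Thompson sampling: draw one sample from each posterior
    Beta(wins + 1, losses + 1) and play the variation with the best draw."""
    draws = {v: random.betavariate(s["wins"] + 1, s["trials"] - s["wins"] + 1)
             for v, s in stats.items()}
    return max(draws, key=draws.get)

# Simulated A/B test: each arriving subject is allocated dynamically.
for _ in range(5000):
    v = choose()
    reward = random.random() < true_rates[v]
    stats[v]["trials"] += 1
    stats[v]["wins"] += reward
```

As evidence accumulates, the allocation drifts toward the better variation, which is exactly the exploration/exploitation trade-off the abstract refers to.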
Essid, Mehdi. "Intégration des données et applications hétérogènes et distribuées sur le Web." Aix-Marseille 1, 2005. http://www.theses.fr/2005AIX11035.
Full textRenard, Hélène. "Équilibrage de charge et redistribution de données sur plates-formes hétérogènes." Lyon, École normale supérieure (sciences), 2005. http://www.theses.fr/2005ENSL0344.
In this thesis, we study iterative algorithms on heterogeneous platforms. These iterative algorithms operate on large data samples (recursive convolution, image processing algorithms, etc.). At each iteration, independent calculations are carried out in parallel, and some communications take place. An abstract view of the problem is the following: the iterative algorithm repeatedly operates on a large rectangular matrix of data samples. This data matrix is split into vertical (or horizontal) slices that are allocated to the processors. At each step of the algorithm, the slices are updated locally, and then boundary information is exchanged between consecutive slices. This (virtual) geometrical constraint advocates that processors be organized as a virtual ring: each processor then communicates only twice, once with its (virtual) predecessor in the ring and once with its successor. Note that there is no reason a priori to restrict to a uni-dimensional partitioning of the data and to map it onto a uni-dimensional ring of processors, but uni-dimensional partitionings are very natural for most applications and, as shown in this thesis, the problem of finding the optimal one is already very difficult. After dealing with the problems of mapping and load balancing onto heterogeneous platforms, we consider the problem of redistributing data onto these platforms, an operation induced by possible variations in resource performance (CPU speed, communication bandwidth) or in system/application requirements (completed tasks, new tasks, migrated tasks, etc.). For homogeneous rings the problem has been completely solved: we have designed optimal algorithms and provided formal proofs of correctness, both for unidirectional and bidirectional rings. For heterogeneous rings further research remains to be conducted: the unidirectional case was easily solved, but the bidirectional case remains open. Still, we have derived an optimal solution for light redistributions, an important case in practice.
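On a heterogeneous platform, a natural load-balancing rule for the slices described above is to give each processor a share of columns proportional to its speed, so that all slices take roughly the same time per iteration. A minimal sketch of that proportional allocation (the speeds and matrix width are invented):

```python
def balance(n_columns, speeds):
    """Allocate matrix columns to processors proportionally to their
    speeds: |slice_i| ~ n_columns * s_i / sum(s), rounded to integers."""
    total = sum(speeds)
    shares = [n_columns * s / total for s in speeds]
    alloc = [int(x) for x in shares]
    # Hand the leftover columns to the largest fractional remainders.
    remainders = sorted(range(len(speeds)),
                        key=lambda i: shares[i] - alloc[i], reverse=True)
    for i in remainders[:n_columns - sum(alloc)]:
        alloc[i] += 1
    return alloc

# Three heterogeneous processors with relative speeds 1 : 2 : 5.
alloc = balance(1000, [1, 2, 5])
```

This only balances computation; the thesis's harder problem also accounts for the ring communications and for choosing which processors to use at all.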
Fereres, Yohan. "Stratégies d’arbitrage systématique multi-classes d'actifs et utilisation de données hétérogènes." Thesis, Paris Est, 2013. http://www.theses.fr/2013PEST0075/document.
Financial markets react more or less quickly and strongly to all kinds of information, depending on the time period under study. In this context, we measure the influence of a broad set of information on systematic multi-asset-class, euro-neutral arbitrage portfolios, under both "naive" and optimal diversification. Our research focuses on systematic tactical asset allocation, and we group this information under the name of heterogeneous data (market data and "other market information"). Market data are end-of-day asset closing prices, while "other market information" gathers economic cycle, sentiment and volatility indicators. We assess the influence of a heterogeneous data combination on our arbitrage portfolios over a time period including the subprime crisis, thanks to data analysis and quantization algorithms. The impact of this combination on our arbitrage portfolios materializes as increased returns, an increased return/volatility ratio over the post-crisis period, and decreased volatility and asset-class correlations. These empirical findings suggest that the presence of "other market information" could be an element of arbitrage portfolio risk diversification. Furthermore, we investigate and bring empirical results to the issue raised by Blitz and Vliet (2008) on global tactical asset allocation (GTAA), by considering "predictive" variables within a systematic market-timing process that integrates heterogeneous data through quantitative data processing.
Fize, Jacques. "Mise en correspondance de données textuelles hétérogènes fondée sur la dimension spatiale." Thesis, Montpellier, 2019. http://www.theses.fr/2019MONTS099.
Full textWith the rise of Big Data, handling the Volume, Velocity and Variety of data concentrates the efforts of research communities to exploit these new resources. These resources have become so important that they are considered the new "black gold". In recent years, volume and velocity have become well-controlled aspects of the data, unlike variety, which remains a major challenge. This thesis presents two contributions in the field of heterogeneous data matching, with a focus on the spatial dimension. The first contribution is based on a two-step process for matching heterogeneous textual data: georepresentation and geomatching. In the first phase, we propose to represent the spatial dimension of each document in a corpus through a dedicated structure, the Spatial Textual Representation (STR). This graph representation is composed of the spatial entities identified in the document, as well as the spatial relationships they maintain. To identify the spatial entities of a document and their spatial relationships, we propose a dedicated resource, called Geodict. The second phase, geomatching, computes the similarity between the generated representations (STRs). Based on the nature of the STR structure (i.e. a graph), different graph matching algorithms were studied. To assess the relevance of a match, we propose a set of six criteria based on a definition of the spatial similarity between two documents. The second contribution is based on the thematic dimension of textual data and its participation in the spatial matching process. We propose to identify the themes that appear in the same contextual window as certain spatial entities. The objective is to induce some of the implicit spatial similarities between the documents. To do this, we propose to extend the structure of the STR using two concepts: the thematic entity and the thematic relationship.
The thematic entity represents a concept specific to a particular field (agronomic, medical) and is represented according to the different spellings present in a terminological resource, in this case a vocabulary. A thematic relationship links a spatial entity to a thematic entity if they appear in the same window. The selected vocabularies and the new form of the STR integrating the thematic dimension are evaluated according to their coverage of the studied corpora, as well as their contribution to the heterogeneous textual matching process on the spatial dimension.
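The geomatching step described above can be illustrated with a toy sketch: two documents' STRs are reduced to a set of spatial entities (nodes) and a set of (entity, relation, entity) edges, then compared with a Jaccard overlap. The thesis itself uses proper graph-matching algorithms and six similarity criteria; the entity and relation names below, and the equal-weight combination, are illustrative assumptions.

```python
def jaccard(a, b):
    """Set overlap; two empty sets are considered identical."""
    return len(a & b) / len(a | b) if a | b else 1.0

def str_similarity(str_a, str_b):
    """Combine node overlap and edge overlap of two STR-like graphs."""
    nodes_a, edges_a = str_a
    nodes_b, edges_b = str_b
    return 0.5 * jaccard(nodes_a, nodes_b) + 0.5 * jaccard(edges_a, edges_b)

# each STR: (spatial entities, spatial relationships)
doc1 = ({"Montpellier", "Herault"},
        {("Montpellier", "in", "Herault")})
doc2 = ({"Montpellier", "Sete", "Herault"},
        {("Montpellier", "in", "Herault"), ("Sete", "in", "Herault")})
print(str_similarity(doc1, doc2))
```

A set-overlap score like this ignores graph structure beyond shared edges, which is precisely why the thesis turns to dedicated graph-matching algorithms for the real geomatching phase.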
Germa, Thierry. "Fusion de données hétérogènes pour la perception de l'homme par robot mobile." Phd thesis, Toulouse 3, 2010. http://thesesups.ups-tlse.fr/1016/.
Full textThis work was carried out within the CommRob European project, involving several academic and industrial partners. The goal of this project is to build a robot companion able to act in structured and dynamic environments cluttered by other agents (robots and humans). In this context, our contribution relates to the multimodal perception of humans (users and passers-by) from the robot. Multimodal perception requires the development and integration of perceptual functions able to detect and identify people and to track their motions in order to communicate with the robot. Proximal detection of the robot's users relies on a multimodal perception framework based on the fusion of heterogeneous data from different sensors. The detected and identified users are then tracked in the video stream of the embedded camera in order to interpret human motions. The first contribution relates to the definition of perceptual functions for detecting and identifying humans from a mobile robot. The second contribution concerns the spatio-temporal analysis of these percepts for user tracking. This work is then extended to multi-target tracking dedicated to passers-by. Finally, as is frequently done in robotics, our work covers two main topics: on the one hand, the approaches are formalized; on the other hand, they are integrated and validated through live experiments. All the developments carried out during this thesis have been integrated on our Rackham platform as well as on the CommRob platform.
Soumana, Ibrahim. "Interrogation des sources de données hétérogènes : une approche pour l'analyse des requêtes." Thesis, Besançon, 2014. http://www.theses.fr/2014BESA1015/document.
Full textNo English summary available.
Allanic, Marianne. "Gestion et visualisation de données hétérogènes multidimensionnelles : application PLM à la neuroimagerie." Thesis, Compiègne, 2015. http://www.theses.fr/2015COMP2248/document.
Full textThe neuroimaging domain is confronted with issues in analyzing and reusing the growing amount of heterogeneous data it produces. Data provenance is complex – multi-subject, multi-method, multi-temporality – and the data are only partially stored, restricting multimodal and longitudinal studies. In particular, functional brain connectivity is studied to understand how areas of the brain work together. Raw and derived imaging data must be properly managed according to several dimensions, such as acquisition time, time between two acquisitions, or subjects and their characteristics. The objective of the thesis is to allow the exploration of complex relationships between heterogeneous data, which is addressed in two parts: (1) how to manage data and provenance, (2) how to visualize structures of multidimensional data. The contributions follow a logical sequence of three propositions, which are presented after a survey of research in heterogeneous data management and graph visualization. The BMI-LM (Bio-Medical Imaging – Lifecycle Management) data model organizes the management of neuroimaging data according to the phases of a study and takes into account the scalability of research thanks to specific classes associated with generic objects. The application of this model in a PLM (Product Lifecycle Management) system shows that concepts developed twenty years ago for the manufacturing industry can be reused to manage neuroimaging data. GMDs (Dynamic Multidimensional Graphs) are introduced to represent complex dynamic relationships within data, together with the JGEX (Json Graph EXchange) format, created to store and exchange GMDs between software applications. The OCL (Overview Constraint Layout) method allows interactive and visual exploration of GMDs. It is based on the preservation of the user's mental map and the alternation of complete and reduced views of the data.
The OCL method is applied to the study of the resting-state functional brain connectivity of 231 subjects, represented by a GMD – the areas of the brain are the nodes and connectivity measures the edges – according to age, gender and laterality; the GMDs are computed through a processing workflow on MRI acquisitions within the PLM system. Results show two main benefits of the OCL method: (1) identification of global trends on one or many dimensions, and (2) highlighting of local changes between GMD states.
Imbert, Alyssa. "Intégration de données hétérogènes complexes à partir de tableaux de tailles déséquilibrées." Thesis, Toulouse 1, 2018. http://www.theses.fr/2018TOU10022/document.
Full textThe development of high-throughput sequencing technologies has led to a massive acquisition of high-dimensional and complex datasets. Several features make these datasets hard to analyze: high dimensionality, heterogeneity at the biological level or at the data-type level, noise (due to biological heterogeneity or to errors in the data) and the presence of missing data (for given values or for an entire individual). The integration of various data is thus an important challenge for computational biology. This thesis is part of a large clinical research project on obesity, DiOGenes, in which we have developed methods for data analysis and integration. The project is based on a dietary intervention conducted in eight European centers, which investigated the effect of macronutrient composition on weight-loss maintenance and on metabolic and cardiovascular risk factors after a phase of calorie restriction in obese individuals. My work has mainly focused on the analysis of transcriptomic data (RNA-Seq) with missing individuals and on the integration of transcriptomic (new QuantSeq protocol) and clinical datasets. The first part focuses on missing data and network inference from RNA-Seq datasets. In a longitudinal study, some observations are missing at some time steps. In order to take advantage of external information measured simultaneously with the RNA-Seq data, we propose an imputation method, hot-deck multiple imputation (hd-MI), that improves the reliability of network inference. The second part deals with an integrative study of clinical data and transcriptomic data, measured by QuantSeq, based on a network approach. The new protocol is shown to be efficient for transcriptome measurement, and we propose an analysis based on network inference that is linked to clinical variables of interest.
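The hot-deck idea behind hd-MI can be sketched in a few lines: a missing expression profile is filled in with the profile of the most similar observed "donor", where similarity is computed on auxiliary variables measured for everyone (e.g. clinical covariates). The variable names, the Euclidean distance, and the single-donor rule below are illustrative assumptions; the actual hd-MI procedure draws several donors to produce multiple imputations.

```python
import math

def hot_deck_impute(aux, expr, missing_ids):
    """Replace missing expression profiles by their nearest donor's profile.

    aux:  dict sample_id -> auxiliary variables (observed for everyone)
    expr: dict sample_id -> expression profile (missing for missing_ids)
    """
    donors = [i for i in aux if i not in missing_ids]
    completed = dict(expr)
    for i in missing_ids:
        # nearest donor in the auxiliary-variable space
        donor = min(donors, key=lambda d: math.dist(aux[i], aux[d]))
        completed[i] = expr[donor]
    return completed

aux = {"s1": [0.1, 1.2], "s2": [0.2, 1.1], "s3": [2.0, 0.3]}
expr = {"s1": [5.0, 7.0, 2.0], "s3": [1.0, 0.5, 9.0]}  # s2 is missing
print(hot_deck_impute(aux, expr, missing_ids={"s2"})["s2"])
```

Drawing from an observed donor (rather than, say, a mean) keeps the imputed profile realistic, which is what makes the subsequent network inference more reliable.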
Lange, Benoît. "Visualisation interactive de données hétérogènes pour l'amélioration des dépenses énergétiques du bâtiment." Thesis, Montpellier 2, 2012. http://www.theses.fr/2012MON20172/document.
Full textEnergy efficiency has become a major issue. Buildings in every country have been identified as a major source of energy loss: they are insufficiently insulated, and losses through the building structure represent a major part of energy expenditure. The RIDER project (Research for IT Driven EneRgy efficiency) emerged from this observation. Its goal is to develop a new kind of IT system to optimize the energy consumption of buildings. This system is based on a component paradigm and is composed of a pivot model, a data warehouse with a data mining approach, and a visualization tool; the last two components are developed to improve the content of the pivot model. In this manuscript, our focus is on the visualization part of the project. The manuscript is composed of two parts: state of the art and contributions. The state of the art comprises basic notions, a visualization chapter and a visual analytics chapter. In the contribution part, we present the data model used in this project and the proposed visualizations, and we conclude with two experiments on real data.
Elghazel, Haytham. "Classification et prévision des données hétérogènes : application aux trajectoires et séjours hospitaliers." Lyon 1, 2007. http://www.theses.fr/2007LYO10325.
Full textRecent years have seen the development of data mining techniques in various application areas, with the purpose of analyzing large and complex data. The medical field is one of these areas, where the available data are numerous and described by various attributes, classical (like patient age and sex) or symbolic (like medical treatments and diagnoses). Data mining generally includes descriptive techniques (which provide an attractive mechanism to automatically find the hidden structure of large data sets) and predictive techniques (able to unearth hidden knowledge from datasets). In this work, the problem of clustering and prediction of heterogeneous data is tackled by a two-stage proposal. The first stage concerns a new clustering approach based on a graph coloring method, named b-coloring. An extension of this approach to incremental clustering is added at the same time: it consists in updating clusters as new data are added to the dataset, without having to perform a complete re-clustering. The second stage concerns sequential data analysis and provides a new framework for clustering sequential data, based on a hybrid model that uses the previous clustering approach and mixture Markov chain models. This method builds a partition of the sequential dataset into cohesive and easily interpretable clusters, and is able to predict the evolution of the sequences of a cluster. Both proposals have then been applied to healthcare data from the PMSI program (the French hospital information system), in order to assist medical professionals in their decision process. In a first step, the b-coloring clustering algorithm is used to provide a new typology of hospital stays as an alternative to the DRG (Diagnosis Related Groups) classification.
In a second step, we define a typology of clinical pathways, which makes it possible to predict features of future paths when a new patient arrives at the clinical center. The overall framework provides a decision-aid system for assisting medical professionals in the planning and management of clinical processes.
Guillemot, Vincent. "Application de méthodes de classification supervisée et intégration de données hétérogènes pour des données transcriptomiques à haut-débit." Phd thesis, Université Paris Sud - Paris XI, 2010. http://tel.archives-ouvertes.fr/tel-00481822.
Full textNajjar, Ahmed. "Forage de données de bases administratives en santé." Doctoral thesis, Université Laval, 2017. http://hdl.handle.net/20.500.11794/28162.
Full textCurrent health systems are increasingly equipped with data collection and storage systems, and a huge amount of data is therefore stored in medical databases. These databases, designed for administrative or billing purposes, are fed with new data whenever a patient uses the healthcare system. This specificity makes them a rich and extremely interesting source of information: they reflect the constraints of reality, capturing elements from a great variety of real medical care situations, and could thus support the design and modeling of medical treatment processes. However, despite their obvious interest, these administrative databases are still underexploited by researchers. In this thesis, we propose a new data mining approach for administrative data, to detect patterns in patient care trajectories. Firstly, we propose an algorithm able to cluster complex objects that represent medical services. These objects are characterized by a mixture of numerical, categorical and multivalued categorical variables. We propose to extract one projection space for each multivalued variable and to modify the computation of the distance between objects to take these projections into account. Secondly, a two-step mixture model is proposed to cluster these objects. This model uses the Gaussian distribution for the numerical variables, the multinomial distribution for the categorical variables and hidden Markov models (HMMs) for the multivalued variables. We thus obtain two algorithms able to cluster complex objects characterized by a mixture of variables. Once this stage is reached, an approach for the discovery of care trajectory patterns is set up, involving the following steps: 1. preprocessing, which allows the building and generation of medical service sets (three sets of medical services are obtained: one for hospital stays, one for consultations and one for visits); 2.
modeling of treatment processes as successions of medical service labels (these complex processes require a sophisticated clustering method, so we propose a clustering algorithm based on HMMs); 3. creation of a visualization and analysis approach for trajectory patterns, to exploit the discovered models. Together, these steps form the knowledge discovery process from medical administrative databases. We apply this approach to databases of patients over 65 years old who live in the province of Quebec and suffer from heart failure. The data are extracted from three databases: the MSSS MED-ÉCHO database, the RAMQ bank and the database containing death certificate data. The obtained results clearly demonstrate the effectiveness of our approach, detecting distinctive patterns that can help healthcare administrators to better manage health treatments.
Pinilla, Erwan. "Données de santé, dynamiques et enjeux de souveraineté." Electronic Thesis or Diss., Strasbourg, 2023. http://www.theses.fr/2023STRAA015.
Full textThe aim of this research is to identify the dynamics of "health data" in the field of digital sovereignty: who can use such data to describe and explain situations, predict trends, and influence the behaviour of individuals, populations, or even States? What is – and what should be – legally protected, and how? We report on and analyze how historical approaches to regulation are being overwhelmed by the diversification of players, techniques and uses; the multiplication of data sources and their dissemination; the shaking of legal categories despite their recent establishment; and the porosity of national and joint systems, due to conventional or aggressive interactions. As a result, we analyze the accelerated advent of new rules at the European level in traditionally regalian fields: cyber infrastructure, qualifications (of data, technologies, uses), and mutual guarantees against interference. Other challenges, such as re-identification and synthetic data, call for in-depth insight, in an era where technological domination has long ceased to be a prerogative of States and where geopolitics has been extended by new tools and practices.
Cherif, Mohamed Abderrazak. "Alignement et fusion de cartes géospatiales multimodales hétérogènes." Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ5002.
Full textThe surge in data across diverse fields creates an essential need for advanced techniques to merge and interpret this information. With a special emphasis on compiling geospatial data, this integration is crucial for unlocking new insights from geographic data, enhancing our ability to map and analyze trends that span different locations and environments with more authenticity and reliability. Existing techniques have made progress in addressing data fusion; however, challenges persist in fusing and harmonizing data from different sources, scales, and modalities. This research presents a comprehensive investigation into the challenges and solutions in vector map alignment and fusion, focusing on developing methods that enhance the precision and usability of geospatial data. We explored and developed three distinct methodologies for polygonal vector map alignment: ProximityAlign, which excels in precision within urban layouts but faces computational challenges; the Optical Flow Deep Learning-Based Alignment, noted for its efficiency and adaptability; and the Epipolar Geometry-Based Alignment, effective in data-rich contexts but sensitive to data quality. Additionally, our study delved into linear feature map alignment, emphasizing the importance of precise alignment and feature attribute transfer, and pointing towards the development of richer, more informative geospatial databases by adapting the ProximityAlign approach to linear features like fault traces and road networks.
The fusion part of our research introduces a sophisticated pipeline to merge polygonal geometries, relying on space partitioning, non-convex optimization of the graph data structure, and geometric operations to produce a reliable fused map that harmonizes the input vector maps while maintaining their geometric and topological integrity. In practice, the developed framework has the potential to improve the quality and usability of integrated geospatial data, benefiting applications such as urban planning, environmental monitoring, and disaster management. This study not only advances theoretical understanding in the field but also provides a solid foundation for practical applications in managing and interpreting large-scale geospatial datasets.
Zorn, Caroline. "Données de santé et secret partagé : pour un droit de la personne à la protection de ses données de santé partagées." Thesis, Nancy 2, 2009. http://www.theses.fr/2009NAN20011.
Full textThe shared medical secret is a legal exception to professional secrecy; it allows a patient's caregivers to exchange health information that is relevant to that patient's care without being punished for revealing confidential information. That caregivers discuss a patient's health information with the other medical professionals involved in that patient's care is to the benefit of the patient. Nonetheless, there is a fine balance to be struck between a "need to know" professional exchange of information, which is essential to the care of the patient, and a broad exchange of information, which may ultimately compromise the confidentiality of the patient's private life. The emergence of electronic tools, which multiply the potential possibilities for data exchange, further disrupts this balance. Consequently, the manipulation of this shared health information must be subject to the medical professional secret, the "Informatique et Libertés" legislation, and the numerous norms and standards defined for the French national electronic medical record (DMP), the pharmaceutical record (Dossier pharmaceutique), or the reimbursement repository (Historique des remboursements). As the patient's health information is increasingly shared between healthcare providers – through means such as the DMP or DP – the patient's right and ability to control access to his or her health information become more and more important. A study of the importance of obtaining the patient's consent led to the following proposal: to inscribe in the French Constitution the patient's right to the confidentiality of his or her health information.
Vandromme, Maxence. "Optimisation combinatoire et extraction de connaissances sur données hétérogènes et temporelles : application à l’identification de parcours patients." Thesis, Lille 1, 2017. http://www.theses.fr/2017LIL10044.
Full textHospital data exhibit numerous specificities that make traditional data mining tools hard to apply. In this thesis, we focus on the heterogeneity of hospital data and on their temporal aspect. This work is done within the framework of the ANR ClinMine research project and a CIFRE partnership with the Alicante company. We propose two new knowledge discovery methods suited to hospital data, each able to perform a variety of tasks: classification, prediction, discovering patient profiles, etc. In the first part, we introduce MOSC (Multi-Objective Sequence Classification), an algorithm for supervised classification on heterogeneous, numeric and temporal data. In addition to binary and symbolic terms, this method uses numeric terms and sequences of temporal events to form sets of classification rules. MOSC is the first classification algorithm able to handle these types of data simultaneously. In the second part, we introduce HBC (Heterogeneous BiClustering), a biclustering algorithm for heterogeneous data, a problem that had never been studied before. This algorithm is extended to support temporal data of various types: temporal events and unevenly sampled time series. HBC is used in a case study on a set of hospital data, whose goal is to identify groups of patients sharing a similar profile. The results make sense from a medical viewpoint; they indicate that relevant, and sometimes new, knowledge is extracted from the data. These results also lead to further, more precise case studies. The integration of HBC into a software product is also under way, with the implementation of a parallel version and a visualization tool for biclustering results.
Morvan, Marie. "Modèles de régression pour données fonctionnelles hétérogènes : application à la modélisation de données de spectrométrie dans le moyen infrarouge." Thesis, Rennes 1, 2019. http://www.theses.fr/2019REN1S097.
Full textIn many application fields, data correspond to curves. This work focuses on the analysis of spectrometric curves, composed of hundreds of ordered variables corresponding to the absorbance values measured at each wavenumber. In this context, an automatic statistical procedure is developed to build a prediction model that takes into account the heterogeneity of the observed data. More precisely, a diagnostic tool is built to predict a metabolic disease from spectrometric curves measured on a population of patients with different profiles. The procedure simultaneously selects the portions of the curves relevant for prediction, builds a partition of the data, and fits a sparse predictive model, using a mixture of penalized regressions suitable for functional data. In order to study the complexity of the data and of the application case, a method to better understand and display the interactions between variables is also developed. This method is based on the study of the structure of the covariance matrix and aims to highlight the dependencies between blocks of variables. A medical example is used to present the method and results, and allows the use of dedicated visualization tools.
Branki, Mohamed Tarek. "Un Processus d'integration de bases de données spatiales hétérogènes par logique de description." Paris 13, 1998. http://www.theses.fr/1998PA132055.
Full textKaakai, Sarah. "Nouveaux paradigmes en dynamique de populations hétérogènes : modélisation trajectorielle, agrégation, et données empiriques." Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066553/document.
Full textThis thesis deals with the probabilistic modeling of heterogeneity in human populations and of its impact on longevity. Over the past few years, numerous studies have shown a significant increase in geographical and socioeconomic inequalities in mortality. New issues have emerged from this paradigm shift that traditional demographic models are not able to solve, and whose formalization requires a careful analysis of the data in a multidisciplinary environment. Using the framework of population dynamics, this thesis aims at illustrating this complexity from different points of view. We first explore the link between heterogeneity and non-linearity in the presence of composition changes in the population, from a mathematical modeling viewpoint. The population dynamic, called Birth Death Swap, is built as the solution of a stochastic equation driven by a Poisson measure, using a more general pathwise comparison result. When swaps occur at a faster rate than demographic events, an averaging result is obtained by stable convergence and comparison. In particular, the aggregated population converges towards a nonlinear dynamic. In the second part, the impact of heterogeneity on aggregate mortality is studied from an empirical viewpoint, using English population data structured by age and socioeconomic circumstances. Based on numerical simulations, we show how a reduction in one cause of death could be compensated in the presence of heterogeneity. The last point of view is an interdisciplinary survey on the determinants of longevity, accompanied by an analysis of the evolution of the tools used to study it and of the new modeling issues raised by this paradigm shift.
Galbaud, du Fort Guillaume. "Epidémiologie et santé mentale du couple : etude comparée de données populationnelles et de données cliniques." Thesis, McGill University, 1991. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=59993.
Full textThe primary results from the study of 845 couples in the general population suggest that there exists a significant spouse-similarity across the various mental health dimensions examined (psychological distress, general well-being, and role satisfaction).
The main results from the study of 17 couples in marital therapy suggest that significant sex differences exist in dyadic adjustment. Sex differences were also noted in the correlations between dyadic adjustment and depressive symptoms.
In conclusion, it appears that epidemiological research on the mental health of couples should have as its objective a simultaneous consideration of both the individual and the couple, as well as a simultaneous consideration of clinical and general populations, in order to create a double complementarity out of this apparent double dichotomy.
Mommessin, Clément. "Gestion efficace des ressources dans les plateformes hétérogènes." Thesis, Université Grenoble Alpes, 2020. https://tel.archives-ouvertes.fr/tel-03179102.
Full textThe world of Information Technology (IT) is in constant evolution. With the explosion of the number of digital and connected devices in our everyday life, IT infrastructures have to face an ever-growing number of users, computing requests, and generated data. The Internet of Things has seen the development of computing platforms at the edge of the network, called Edge Computing, to bridge the gap between the connected devices and the Cloud. In the domain of High Performance Computing, the parallel programs executed on the platforms require ever more computing power in the search for improved performance. Besides, the past years have seen a diversification of the hardware composing these infrastructures. This growing complexity of the (networks of) computing platforms poses several optimisation challenges that can appear at different levels. In particular, it has led to a need for better management systems that make efficient usage of the heterogeneous resources composing these platforms. The work presented in this thesis focuses on resource optimisation problems for distributed and parallel platforms in the Edge Computing and High Performance Computing domains. In both cases, we study the modelling of the problems and propose methods and algorithms to optimise resource management for better performance, in terms of the quality of the solutions. The problems are studied from both theoretical and practical perspectives. More specifically, we study resource management problems at multiple levels of the Qarnot Computing platform, an Edge Computing production platform mostly composed of computing resources deployed in heaters of smart buildings. In this regard, we propose extensions to the Batsim simulator to enable the simulation of Edge Computing platforms and to ease the design, development and comparison of data and job placement policies in such platforms. Then, we design a new temperature prediction method for smart buildings and propose a formulation of
a new scheduling problem with two agents on multiple machines. In parallel, we study the problem of scheduling applications on hybrid multi-core machines with the objective of minimising the completion time of the overall application. We survey existing algorithms providing performance guarantees on the constructed schedules and propose two new algorithms for different settings of the problem, proving performance guarantees for both. Then, we conduct an experimental campaign to compare in practice the relative performance of the new algorithms with existing solutions from the literature.
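The hybrid-machine scheduling problem mentioned above can be sketched with a naive greedy rule: each independent task has a CPU time and a GPU time, and is assigned to whichever resource would finish it earliest. The instance below is an illustrative assumption; the thesis algorithms come with proven performance guarantees, which this simple greedy does not.

```python
def greedy_hybrid_schedule(tasks, n_cpu, n_gpu):
    """tasks: list of (cpu_time, gpu_time) pairs. Returns the makespan."""
    cpu_loads = [0.0] * n_cpu
    gpu_loads = [0.0] * n_gpu
    for cpu_t, gpu_t in tasks:
        # earliest finish time on each resource type
        cpu_finish = min(cpu_loads) + cpu_t
        gpu_finish = min(gpu_loads) + gpu_t
        if cpu_finish <= gpu_finish:
            cpu_loads[cpu_loads.index(min(cpu_loads))] = cpu_finish
        else:
            gpu_loads[gpu_loads.index(min(gpu_loads))] = gpu_finish
    return max(cpu_loads + gpu_loads)

# four tasks: (time on a CPU, time on a GPU), on 2 CPUs and 1 GPU
tasks = [(4.0, 1.0), (4.0, 1.0), (2.0, 5.0), (2.0, 5.0)]
print(greedy_hybrid_schedule(tasks, n_cpu=2, n_gpu=1))
```

The interesting difficulty, which the greedy rule ignores, is that placing a task on its fastest resource type can overload that resource; approximation algorithms for this problem must balance affinity against load.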
Lumineau, Nicolas. "Organisation et localisation de données hétérogènes et réparties sur un réseau Pair-à-Pair." Paris 6, 2005. http://www.theses.fr/2005PA066436.
Full textKretz, Vincent. "Intégration de données de déplacements de fluides dans la caractérisation de milieux poreux hétérogènes." Paris 6, 2002. http://www.theses.fr/2002PA066200.
Full textColonna, François-Marie. "Intégration de données hétérogènes et distribuées sur le web et applications à la biologie." Aix-Marseille 3, 2008. http://www.theses.fr/2008AIX30050.
Full textOver the past twenty years, the volume of data generated by genomics and biology has grown exponentially. Interoperation of publicly available or copyrighted data sources is difficult due to the syntactic and semantic heterogeneity between them. Integrating heterogeneous data is thus nowadays one of the most important fields of research in databases, especially in the biological domain, for example for predictive medicine purposes. The work presented in this thesis is organised around two classes of integration problems. The first part of our work deals with joining data sets across several data sources; this method is based on a description of source capabilities using feature logics. The second part is a contribution to the development of a BGLAV mediation architecture based on semi-structured data, for effortless and flexible data integration using the XQuery language.
Kefi, Hassen. "Ontologies et aide à l'utilisateur pour l'interrogation de sources multiples et hétérogènes." Paris 11, 2006. http://www.theses.fr/2006PA112016.
Full text
The explosion in the number of information sources available on the Web multiplies the need for techniques to integrate multiple, heterogeneous data sources. These techniques rest on the construction of a uniform view of the distributed data, giving the user the impression of querying a homogeneous, centralized system. The work undertaken in this thesis concerns ontologies as tools for assisting users in querying an information server. We treated two aspects of ontologies: ontologies as a query refinement tool on the one hand, and as an aid to unified querying on the other. Concerning the first aspect, we propose to build, gradually and interactively with the user, more specific and more constrained queries until fewer and more relevant answers are obtained. Our approach is based on the combined use of a domain ontology and Galois lattices. Concerning the second aspect, we propose a generic ontology alignment approach implemented as a semi-automatic process. The approach applies in the presence of a structural asymmetry between the compared taxonomies. We propose to use terminological, structural and semantic techniques together, in a precisely defined order. These two aspects were the subject of distinct works carried out within two projects: the Picsel 2 project, carried out in collaboration with France Telecom R&D with tourism as its experimental domain, and the RNTL eDot project, which concerns the analysis of the bacteriological risk of food contamination.
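The Galois-lattice-based refinement mentioned in this abstract comes from formal concept analysis: a query (a set of attributes) is closed under the Galois connection, and refinements are attribute additions that strictly narrow the answer set. The sketch below is a minimal illustration under assumed data (the toy tourism context, and the `extent`/`intent`/`refine` names are hypothetical, not the thesis's implementation).

```python
from itertools import chain

# Toy context: tourism offers (objects) and their descriptors (attributes).
context = {
    "offer1": {"sea", "hotel"},
    "offer2": {"sea", "camping"},
    "offer3": {"mountain", "hotel"},
}

def extent(attrs):
    """Objects whose description contains every attribute in `attrs`."""
    return {o for o, a in context.items() if attrs <= a}

def intent(objs):
    """Attributes shared by every object in `objs`."""
    sets = [context[o] for o in objs]
    return set.intersection(*sets) if sets else set(chain(*context.values()))

def refine(query):
    """One Galois refinement step: close the query under the Galois
    connection, then propose attribute additions that strictly shrink
    the (non-empty) answer set."""
    closed = intent(extent(query))  # Galois closure of the query
    candidates = set(chain(*context.values())) - closed
    return {a: extent(closed | {a}) for a in candidates
            if set() < extent(closed | {a}) < extent(closed)}

print(refine({"sea"}))
```

Starting from the query `{"sea"}` (two matching offers), the step proposes `hotel` and `camping` as refinements, each narrowing the answers to a single offer, while `mountain` is discarded because it would empty the answer set. This interactive narrowing loop is the behaviour the abstract describes.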
Lechevalier, Fabien. "Les fiducies de données personnelles de santé : étude illustrée des enjeux et bénéfices d’une gestion collective de la propriété des données personnelles de santé." Master's thesis, Université Laval, 2020. http://hdl.handle.net/20.500.11794/67590.
Full text
Babilliot, Alain. "Typologie critique des méthodes informatiques pour l'analyse des données en épidémiologie." Paris 9, 1988. https://portail.bu.dauphine.fr/fileviewer/index.php?doc=1988PA090033.
Full text