Dissertations / Theses on the topic 'Intégration de données de peptidomique'
Consult the top 50 dissertations / theses for your research on the topic 'Intégration de données de peptidomique.'
Suwareh, Ousmane. "Modélisation de la pepsinolyse in vitro en conditions gastriques et inférence de réseaux de filiation de peptides à partir de données de peptidomique." Electronic Thesis or Diss., Rennes, Agrocampus Ouest, 2022. https://tel.archives-ouvertes.fr/tel-04059711.
Addressing the current demographic challenges, “civilization diseases” and the possible depletion of food resources requires optimizing food utilization and adapting food design to the specific needs of each target population. This in turn requires a better understanding of the different stages of the digestion process. In particular, how proteins are hydrolyzed is a major issue, due to their crucial role in human nutrition. However, the probabilistic laws governing the action of pepsin, the first protease to act in the gastrointestinal tract, are still unclear. In a first approach based on peptidomic data, we demonstrate that the hydrolysis by pepsin of a peptide bond depends on the nature of the amino acid residues in its large neighborhood, but also on physicochemical and structural variables describing its environment. In a second step, and considering the physicochemical environment at the peptide level, we propose a nonparametric model of the hydrolysis by pepsin of these peptides, and an Expectation-Maximization type estimation algorithm, offering novel perspectives for the valorization of peptidomic data. In this dynamic approach, we integrate the peptide kinship network into the estimation procedure, which leads to a more parsimonious model that is also more relevant for biological interpretation.
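The first step of this abstract, predicting whether pepsin cleaves a given peptide bond from the residues around it, can be pictured with a small classifier sketch. This is a hypothetical illustration only: the window width, the one-hot encoding, the toy labels and the logistic model are assumptions, not the thesis's nonparametric model.

# Hypothetical sketch: predict cleavage of a peptide bond from the
# residues in a window around it (toy data, illustrative encoding).
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder
import numpy as np

AA = list("ACDEFGHIKLMNPQRSTVWY")

def bond_windows(seq, half=4):
    """Yield the 2*half residues surrounding each peptide bond."""
    padded = "X" * half + seq + "X" * half
    for i in range(len(seq) - 1):                 # bond between i and i+1
        yield list(padded[i + 1 : i + 1 + 2 * half])

X_raw = list(bond_windows("MKLVFAAGHS"))          # 9 bonds, 8-residue windows
y = np.array([0, 0, 1, 1, 0, 0, 0, 1, 0])         # toy cleavage labels

enc = OneHotEncoder(categories=[AA + ["X"]] * 8, handle_unknown="ignore")
X = enc.fit_transform(X_raw)
model = LogisticRegression().fit(X, y)
print(model.predict_proba(X)[:, 1])               # cleavage probability per bond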
Le, Béchec Antony. "Gestion, analyse et intégration des données transcriptomiques." Rennes 1, 2007. http://www.theses.fr/2007REN1S051.
Aiming at a better understanding of diseases, transcriptomic approaches allow the analysis of several thousands of genes in a single experiment. To date, international standardization initiatives have allowed the whole scientific community to use the large quantities of data generated by transcriptomic approaches, and a large number of algorithms are available to process and analyze these data sets. However, the major remaining challenge is to provide biological interpretations for these large data sets. In particular, their integration with additional biological knowledge would certainly lead to an improved understanding of complex biological mechanisms. In my thesis work, I have developed a novel and evolvable environment for the management and analysis of transcriptomic data. Micro@rray Integrated Application (M@IA) allows the management, processing and analysis of large-scale expression data sets. In addition, I elaborated a computational method to combine multiple data sources and represent differentially expressed gene networks as interaction graphs. Finally, I used a meta-analysis of gene expression data extracted from the literature to select and combine similar studies associated with the progression of liver cancer. In conclusion, this work provides a novel tool and original analytical methodologies, thus contributing to the emerging field of integrative biology, which is indispensable for a better understanding of complex pathophysiological processes.
Saïs, Fatiha. "Intégration sémantique de données guidée par une ontologie." Paris 11, 2007. http://www.theses.fr/2007PA112300.
This thesis deals with semantic data integration guided by an ontology. Data integration aims at combining autonomous and heterogeneous data sources. To this end, all the data should be represented according to the same schema and with a unified semantics. This thesis is divided into two parts. In the first one, we present an automatic and flexible method for reconciling data with an ontology. We consider the case where data are represented in tables. The reconciliation result is represented in the SML format which we have defined. Its originality stems from the fact that it allows representing all the established mappings but also information that is imperfectly identified. In the second part, we present two methods of reference reconciliation. This problem consists in deciding whether different data descriptions refer to the same real-world entity. We have considered this problem when data are described according to the same schema. The first method, called L2R, is logical: it translates the schema and the data semantics into a set of logical rules which allow inferring correct decisions of both reconciliation and non-reconciliation. The second method, called N2R, is numerical. It translates the schema semantics into an informed similarity measure used in a numerical computation of the similarity of reference pairs. This computation is expressed as a nonlinear equation system solved by an iterative method. Our experiments on real datasets demonstrated the robustness and feasibility of our approaches. The solutions that we bring to the two reconciliation problems are completely automatic and guided only by an ontology.
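The N2R computation described above, pair similarities defined by a nonlinear equation system and solved iteratively, can be sketched as a simple fixpoint loop. This is an illustrative simplification, not the thesis's actual equations: the combination function, the weight and the toy data are assumptions.

# Illustrative fixpoint iteration in the spirit of N2R: each reference
# pair combines its attribute similarity with the best score of the
# pairs it depends on through the schema. Toy data and weights.
def iterate_similarities(attr_sim, depends_on, weight=0.5, iters=20):
    scores = dict(attr_sim)
    for _ in range(iters):
        scores = {
            pair: (1 - weight) * base
                  + weight * max((scores[q] for q in depends_on.get(pair, ())),
                                 default=0.0)
            for pair, base in attr_sim.items()
        }
    return scores

attr_sim = {("a1", "b1"): 0.9, ("a2", "b2"): 0.2}
depends_on = {("a2", "b2"): [("a1", "b1")]}      # related via a schema link
print(iterate_similarities(attr_sim, depends_on))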
Delanaux, Rémy. "Intégration de données liées respectueuse de la confidentialité." Thesis, Lyon, 2019. http://www.theses.fr/2019LYSE1303.
Individual privacy is a major and largely unexplored concern when publishing new datasets in the context of Linked Open Data (LOD). The LOD cloud forms a network of interconnected and publicly accessible datasets in the form of graph databases modeled using the RDF format and queried using the SPARQL language. This heavily standardized context is nowadays extensively used by academics, public institutions and some private organizations to make their data available. Yet, some industrial and private actors may be discouraged by potential privacy issues. To this end, we introduce and develop a declarative framework for privacy-preserving Linked Data publishing, in which privacy and utility constraints are specified as policies, that is, sets of SPARQL queries. Our approach is data-independent and only inspects the privacy and utility policies in order to determine the sequence of anonymization operations applicable to any graph instance for satisfying the policies. We prove the soundness of our algorithms and gauge their performance through experimental analysis. Another aspect to take into account is that a new dataset published to the LOD cloud is indeed exposed to privacy breaches due to possible linkage to objects already existing in other LOD datasets. In the second part of this thesis, we thus focus on the problem of building safe anonymizations of an RDF graph, to guarantee that linking the anonymized graph with any external RDF graph will not cause privacy breaches. Given a set of privacy queries as input, we study the data-independent safety problem and the sequence of anonymization operations necessary to enforce it. We provide sufficient conditions under which an anonymization instance is safe given a set of privacy queries. Additionally, we show that our algorithms are robust in the presence of sameAs links, whether explicit or inferred from additional knowledge. To conclude, we evaluate the impact of this safety-preserving solution on given input graphs through experiments. We focus on the performance and the utility loss of this anonymization framework on both real-world and artificial data. We first discuss and select utility measures to compare the original graph to its anonymized counterpart, then define a method to generate new privacy policies from a reference one by inserting incremental modifications. We study the behavior of the framework on four carefully selected RDF graphs. We show that our anonymization technique is effective with reasonable runtime on quite large graphs (several million triples) and is gradual: the more specific the privacy policy is, the smaller its impact. Finally, using structural graph-based metrics, we show that our algorithms are not very destructive even when privacy policies cover a large part of the graph. By designing a simple and efficient way to ensure privacy and utility in plausible usages of RDF graphs, this new approach suggests many extensions and, in the long run, more work on privacy-preserving data publishing in the context of Linked Open Data.
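The notion of a privacy policy as a set of SPARQL queries can be made concrete with a few lines of rdflib: an anonymization is acceptable when every privacy query returns no answer on the published graph. The data, the query and the crude deletion operation below are invented for illustration; they are not the thesis's algorithms.

# Minimal sketch: a privacy policy is a SPARQL query that must return
# no result on the anonymized graph. Toy data and a crude deletion op.
from rdflib import Graph

g = Graph().parse(format="turtle", data="""
@prefix ex: <http://example.org/> .
ex:alice ex:visited ex:clinic1 .
""")

privacy_query = """
PREFIX ex: <http://example.org/>
SELECT ?p WHERE { ?p ex:visited ?place . }
"""

violations = list(g.query(privacy_query))
print("policy violated:", bool(violations))

for (person,) in violations:               # anonymize: drop the triples
    g.remove((person, None, None))
print("after anonymization:", len(list(g.query(privacy_query))))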
Megdiche, Bousarsar Imen. "Intégration holistique et entreposage automatique des données ouvertes." Thesis, Toulouse 3, 2015. http://www.theses.fr/2015TOU30214/document.
Statistical Open Data provide useful information to feed a decision-making system. Their integration and storage within these systems is achieved through ETL processes. It is necessary to automate these processes in order to make them accessible to non-experts. These processes must also cope with the lack of schemas and with the structural and semantic heterogeneity that characterize Open Data. To address these issues, we propose a new ETL approach based on graphs. For the extraction, we propose automatic activities performing detection and annotation based on a table model. For the transformation, we propose a linear program performing the holistic integration of several graphs. This model supplies a unique, optimal solution. For the loading, we propose a progressive process for the definition of the multidimensional schema and the augmentation of the integrated graph. Finally, we present a prototype and experimental evaluations.
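The transformation step, holistic matching of several graphs through a linear program, can be sketched with a tiny assignment LP. The similarity matrix and the constraints below are toy assumptions; the thesis's program handles several graphs and structural constraints at once.

# Sketch: node matching between two schema graphs as a linear program
# (maximize similarity, each node matched at most once). Toy values.
import numpy as np
from scipy.optimize import linprog

sim = np.array([[0.9, 0.1],
                [0.2, 0.8]])          # sim[i, j]: node i of A vs node j of B
n, m = sim.shape
c = -sim.ravel()                      # linprog minimizes, so negate

A_ub, b_ub = [], []
for i in range(n):                    # each node of A used at most once
    row = np.zeros(n * m); row[i * m:(i + 1) * m] = 1
    A_ub.append(row); b_ub.append(1)
for j in range(m):                    # each node of B used at most once
    row = np.zeros(n * m); row[j::m] = 1
    A_ub.append(row); b_ub.append(1)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, 1))
print(res.x.reshape(n, m).round())    # integral 0/1 matching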
Grison, Thierry. "Intégration de schémas de bases de données entité-association." Dijon, 1994. http://www.theses.fr/1994DIJOS005.
Full textHamdoun, Khalfallah Sana. "Construction d'entrepôts de données par intégration de sources hétérogènes." Paris 13, 2006. http://www.theses.fr/2006PA132039.
This work describes the construction of a data warehouse by the integration of heterogeneous data, which may be structured, semi-structured or unstructured. We propose a theoretical approach based on the definition of an integration environment. This environment is formed by data sources and inter-schema relationships between these sources (equivalence and strict order relations). Our approach is composed of five steps covering the choice of data warehouse components, global schema generation and the construction of data warehouse views. Multidimensional schemas are also proposed. All the stages proposed in this work are implemented in a functional prototype (using SQL and XQuery). Keywords: data integration, data warehouses, heterogeneous data, inter-schema relationships, relational, object-relational, XML, SQL, XQuery.
Lemoine, Frédéric. "Intégration, interrogation et analyse de données de génomique comparative." Paris 11, 2008. http://www.theses.fr/2008PA112180.
Our work takes place within the « Microbiogenomics » project, which aims at building a prokaryotic genomic data warehouse. This data warehouse gathers numerous data currently dispersed, in order to improve the functional annotation of bacterial genomes. Within this project, our work has several facets. The first one focuses mainly on the analysis of biological data. We are particularly interested in the conservation of gene order during the evolution of prokaryotic genomes. To do so, we designed a computational pipeline aiming at detecting the areas whose gene order is conserved. We then studied the relative evolution of the proteins coded by genes located in conserved areas, in comparison with the other proteins. These data were made available through the SynteView synteny visualization tool (http://www.synteview.u-psud.fr). Moreover, to broaden the analysis of these data, we need to cross them with other kinds of data, such as pathway data. These data, often dispersed and heterogeneous, are difficult to query. That is why, in a second step, we were interested in querying the Microbiogenomics data warehouse. We designed an architecture and algorithms to query the data warehouse while keeping the different points of view given by the sources. These algorithms were implemented in GenoQuery (http://www.lri.fr/~lemoine/GenoQuery), a prototype querying module adapted to a genomic data warehouse.
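The pipeline's core idea, detecting conserved gene order, can be illustrated in a few lines: two genes adjacent in one genome form a conserved pair when their orthologs are also adjacent in another genome. The gene orders and the orthology map below are toy data; real pipelines also tolerate gaps and strand changes.

genome_a = ["gA1", "gA2", "gA3", "gA4"]
genome_b = ["gB9", "gB2", "gB3", "gB7"]
ortholog = {"gA2": "gB2", "gA3": "gB3"}          # A -> B orthology (toy)

pos_b = {g: i for i, g in enumerate(genome_b)}

conserved = []
for x, y in zip(genome_a, genome_a[1:]):         # adjacent pairs in A
    ox, oy = ortholog.get(x), ortholog.get(y)
    if ox in pos_b and oy in pos_b and abs(pos_b[ox] - pos_b[oy]) == 1:
        conserved.append(((x, y), (ox, oy)))

print(conserved)                                 # [(('gA2','gA3'), ('gB2','gB3'))]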
Lozano, Espinosa Rafael. "Intégration de données vidéo dans un SGBD à objets." Université Joseph Fourier (Grenoble), 2000. http://www.theses.fr/2000GRE10050.
Full textJautzy, Olivier. "Intégration de sources de données hétérogènes : Une approche langage." Marne-la-vallée, ENPC, 2000. http://www.theses.fr/2000ENPC0002.
Full textMathieu, Jean. "Intégration de données temps-réel issues de capteurs dans un entrepôt de données géo-décisionnel." Thesis, Université Laval, 2011. http://www.theses.ulaval.ca/2011/28019/28019.pdf.
In the last decade, the use of sensors for measuring various phenomena has greatly increased. As such, we can now use sensors to measure GPS position, temperature and even a person's heartbeat. Nowadays, the wide diversity of sensors makes them the best tools for gathering data. Along with this effervescence, analysis tools have also advanced since the creation of transactional databases, leading to a new category of tools, Business Intelligence (BI) systems, which respond to the need for global analysis of the data. Data warehouses and OLAP (On-Line Analytical Processing) tools, which belong to this category, enable users to analyze big volumes of data, execute time-based requests and build statistical graphs in a few simple mouse clicks. Although the various types of sensors can surely enrich any analysis, such data require heavy integration processes to be driven into the data warehouse, the centerpiece of any decision-making process. The different data types produced by sensors, the variety of sensor models and the ways to transfer such data are still significant obstacles to the integration of sensor data streams into a geo-decisional data warehouse. Also, current geo-decisional data warehouses are not initially built to welcome new data at a high frequency. Since the performance of a data warehouse is reduced during an update, new data are usually added weekly, monthly, etc. However, some data warehouses, called Real-Time Data Warehouses (RTDW), can be updated several times a day without their performance diminishing during the process. But this technology is not very common, is very costly and is in most cases considered a "beta" version. Therefore, this research aims to develop an approach for publishing and normalizing real-time sensor data streams and integrating them into a classic data warehouse. An optimized update strategy has also been developed so that frequent new data can be added to the analysis without affecting the data warehouse's performance.
Verdin, David. "Intégration et recherche de données environnementales dans un système d'information." Rennes, Agrocampus Ouest, 2006. http://www.theses.fr/2006NSARG003.
Full textCohen-Boulakia, Sarah. "Intégration de données biologiques : sélection de sources centrée sur l'utilisateur." Paris 11, 2005. http://www.theses.fr/2005PA112243.
Our work takes place in the context of data integration, which aims at developing solutions to offer uniform access to multiple, distributed and heterogeneous biological databases. Life sciences are continuously evolving, so that the number and size of new sources providing specialized information in biological sciences have increased exponentially in the last few years. Scientists are therefore frequently faced with the problem of integrating their data with information from multiple heterogeneous sources and analyzing the data with bioinformatics tools. They thus have to select sources and tools when interpreting their data. The diversity of available sources and tools makes it increasingly difficult to make this selection without assistance. Our work was developed following a thorough study of scientists' needs during querying and data management. After interviewing scientists working in various domains, we found that biologists express preferences concerning the sources to be queried and that the querying process itself (the strategy followed) differs between scientists. In response to these findings, we first introduced a cooperative mediator, making it possible to meet some of the key requirements of experts. Secondly, we proposed two modules to select data sources: DSS, then BioGuide, which generalizes DSS. BioGuide (http://www.lri.fr/~cohen/bioguide/bioguide.html) is a user-centric framework which helps scientists choose suitable sources and tools, find complementary information in sources, and deal with divergent data. It provides answers respecting the preferences of the user and obtained following his or her strategy.
Monceyron, Jean-Luc. "Intégration des données et conception des bâtiments : cas de l'acoustique." Lyon, INSA, 1996. http://www.theses.fr/1996ISAL0119.
A system analysis of the building design process leads to the identification of islands of information: the software currently in use during the design process is unable to exchange product data describing a building project, primarily because of the partitioning of the knowledge that gives meaning to the involved information. To break out of this isolation, multi-sector research works dedicated to product model data representation, exchange and sharing have contributed to the implementation of the ISO 10303 STEP standard. Our research work consists in applying these developments to the building design field, with an emphasis on data exchange between an architect (at runtime via a CAD system) and an acoustician (via a knowledge-based system dedicated to checking compliance with the latest French acoustics regulation). To meet this goal, the acoustician's viewpoint on the project was interpreted into a data model, and we re-used an integration framework defined in the ESPRIT project 7280 ATLAS, based on the STEP standard. Furthermore, the exchange with the architect's view model required the specification of data mappings, achieved through a rule-based approach. This investigation, combined with the study of several modelling experiments in the building industry, has been generalized for the integration of a design field into a wider framework for computer-aided building design. The proposed method describes the activities required for the development of a view model and is based on a three-layer architecture: resources, domain and field, the latter reflecting the concept of an actor view.
Hoebeke, Mark. "Intégration de données et prédiction d'interactions protéiques chez Bacillus subtilis." Versailles-St Quentin en Yvelines, 2005. http://www.theses.fr/2005VERS0004.
This thesis work largely concerns making bioinformatics applications available to the community of biologists. It addresses both data integration aspects and user interface issues. Within this thesis, several databases were put into production: a database of experimental results dedicated to protein interactions (SPiD), and a more integrative database gathering both annotated genomes and in silico analysis results (SeqDB/ORIGAMI). Each of these databases was equipped with suitable access interfaces (a dynamic graphical interface for the interaction database, a programmatic interface for the integrative database). A generic client for the simultaneous exploration of several annotated genomes, supplemented with analysis results, was also developed and is particularly useful for exploiting the results contained in ORIGAMI. These resources were used to predict protein interactions in B. subtilis. These predictions rely, on the one hand, on orthology relationships between proteins belonging to interaction networks in other species and the proteins of B. subtilis, and on the other hand, on the different levels of conservation of gene neighborhood in a set of completely sequenced genomes (gene fusion, immediate neighborhood, more distant neighborhood).
Gadchaux, Christian. "Inclusion des étudiants en situation de handicap à l’université : approches croisées anthropologique et didactique, analyse de données d’entretiens cliniques et de données vidéo." Rouen, 2016. http://www.theses.fr/2016ROUEL027.
The university has seen a rise in the number of disabled students (law 2005-103) and is facing the complex challenge of inclusion (law 2013-595). The law on equal opportunities and rights made their admission compulsory, and the law on schooling and the refoundation of the Republic requires the establishment of an inclusive university. We question the inclusion of students with disabilities both in its socialization aspect, through an anthropological approach, and in the production and transmission of academic knowledge and its accessibility, through a didactic approach. We take stock of the issue through a critical historical analysis of texts and institutions, in order to trace the conceptual evolution of the notions according to their origin. We analyze and critique official documents and institutional positioning. Building on scientific research, we take up several conceptual elements (liminality, stigma, interaction rituals, rites of institution) to pursue the anthropological approach to disability. The didactics of joint action that we present seems well suited to the relational needs of inclusion in academic higher education. Our thesis adopts a clinical and collaborative approach, giving pride of place to the actors: students, with disabilities or not, and teachers. We listen to three students with disabilities, three regular students (all in the Bachelor's program, two in Sociology and Anthropology) and two teachers (one experienced, the other a novice). These clinical interviews, collected using the Rogerian method of active listening, give rise to a corpus analyzed in the thesis and provided in full in the appendices, to serve as empirical contributions to scientific research. We combine anthropological and didactic approaches, the description and video analysis of filmed sociology tutorials, and cross-analysis by experienced teachers. What about the relationship and the production and transmission of knowledge in tutorials, with and for students with disabilities, at the university today? While the clinical perspective does not allow us to report comprehensively on current developments, it does allow us to better pose certain questions in light of the societal and anthropological issues of inclusion and cognitive socialization at university.
Wechman, Christophe. "Intégration de méthodes de data mining dans le domaine de l'olfaction." Orléans, 2005. http://www.theses.fr/2005ORLE2047.
Full textMoriceau, Véronique. "Intégration de données dans un système question-réponse sur le Web." Toulouse 3, 2007. http://www.theses.fr/2007TOU30019.
In the framework of question-answering systems on the Web, our main goals are to model, develop and evaluate a system which can, from a question in natural language, search for relevant answers on the Web and generate a synthetic answer, even if the search engine has selected several candidate answers. We focused on temporal and numerical questions. Our system deals with: (i) the integration of data from candidate answers, using a knowledge base and knowledge extracted from the Web; this component detects data inconsistencies and handles user expectations in order to produce a relevant answer; and (ii) the generation of synthetic answers in natural language that are relevant with respect to users. Indeed, generated answers have to be short, understandable, and have to express the cooperative know-how used to resolve data inconsistencies. We also propose methods to evaluate our system from a technical and cognitive point of view.
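The integration component, reconciling several candidate numerical answers before generating a synthetic one, can be sketched as a simple aggregation with an inconsistency test. The relative-tolerance threshold and the wording of the answers are assumptions for illustration, not the thesis's actual rules.

# Sketch: merge candidate numeric answers; report a range when the
# candidates disagree beyond an assumed relative tolerance.
from statistics import median

def synthesize(values, tolerance=0.10):
    mid = median(values)
    spread = (max(values) - min(values)) / mid
    if spread <= tolerance:
        return f"about {mid}"
    return f"between {min(values)} and {max(values)} (sources disagree)"

print(synthesize([8848, 8850, 8849]))   # consistent candidates
print(synthesize([8848, 8200]))         # inconsistent candidates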
Essid, Mehdi. "Intégration des données et applications hétérogènes et distribuées sur le Web." Aix-Marseille 1, 2005. http://www.theses.fr/2005AIX11035.
Full textAlamé, Melissa. "Intégration de données et caractérisation du microenvironnement tumoral de tumeurs rares." Thesis, Montpellier, 2020. http://www.theses.fr/2020MONTT046.
The development of high-throughput technologies, especially Next Generation Sequencing, has triggered considerable advances in tumor understanding and molecular classification. Patient subgroups for the same tumor have been defined and characterized. Those subgroups are typically associated with a particular prognosis or eligible for a specific targeted therapy. This progress paved the way towards personalized medicine. The understanding of the contribution of the tumor microenvironment (TME) to disease aggressiveness, progression and therapy resistance is another revolution in cancer biology and patient care. The contribution of the aforementioned high-throughput technologies was essential. At the era of immunotherapy, the sub-classification of tumors based on their TME composition identified patient subgroups correlated with survival and with their response to this particular class of drugs. Despite a formidable community effort, the molecular and immunological classification of tumors has not been completed for every cancer; some rare and aggressive entities still require thorough characterization. Moreover, most TME studies have focused on the cellular composition and neglected the mapping of the intercellular communication networks occurring in neoplasms. The advent of single-cell technologies is filling this gap, but with a strong focus on the most frequent cancers. In my thesis, I deployed both advanced data integration methods and a novel approach to infer ligand-receptor networks relying on a database (LRdb) developed by the Colinge Lab, to characterize the TME of two rare tumors, Salivary Duct Carcinoma (SDC) and Primary Central Nervous System Diffuse Large B-Cell Lymphoma (PCNSL). I combined classical, yet advanced, bioinformatic and multivariate statistical methods integrating bulk transcriptomics and proteomics data, including fresh and TCGA data. Those computational techniques were supplemented with immunofluorescence and immunohistochemistry coupled with digital imaging to obtain experimental validations. To accommodate limited patient cohorts, I searched for highly coherent messages at all levels of my analyses. I also devoted substantial effort to relating our findings to the literature to put them in a clinical perspective. In particular, our approach revealed TME groups of tumors with particular prognosis, immune evasion and therapy resistance mechanisms, several clinical biomarkers, and new therapeutic perspectives.
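The ligand-receptor inference step can be pictured with a toy scoring loop: a pair from an LRdb-style list is kept as a putative communication edge when the ligand is expressed by a source cell population and the receptor by a target population. The expression values, the cutoff and the geometric-mean score are assumptions for illustration, not the thesis's actual method.

import math

lr_pairs = [("TGFB1", "TGFBR1"), ("CD40LG", "CD40")]   # LRdb-style pairs
expr = {                                               # toy mean expression
    "tumor": {"TGFB1": 5.2, "CD40LG": 0.1, "TGFBR1": 1.0, "CD40": 0.2},
    "Tcell": {"TGFB1": 0.3, "CD40LG": 4.0, "TGFBR1": 2.5, "CD40": 3.1},
}

edges = []
for lig, rec in lr_pairs:
    for src in expr:
        for dst in expr:
            score = math.sqrt(expr[src][lig] * expr[dst][rec])
            if score > 1.5:                            # assumed cutoff
                edges.append((src, lig, rec, dst, round(score, 2)))

for edge in edges:
    print(edge)                                        # inferred L-R edges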
Amouroux, Rémy. "Intégration de logiciels : utilisation de données réactives pour la construction d'applications." Université Joseph Fourier (Grenoble), 1996. http://www.theses.fr/1996GRE10145.
After a study of the domain, we propose an approach that brings cooperation and data sharing together through a suitable data model. This model takes up the concepts of the object model, extended with notions of relations between data and of data reactivity. The use of active rules associated with the objects shared by components makes it possible to separate the definition of the interactions between components from the definition of the components themselves. These choices allow the integration of existing components into an application, as well as the construction of components that can be adapted to several applications.
We complete our proposals by defining, on the one hand, a reference architecture and basic services for software integration, and on the other hand by offering the designer of an integrated application construction principles that facilitate the maintenance and evolution of the application.
This work was validated by the concrete implementation of a software integration environment, which allowed us to experiment with our approach by building software development environments.
Wery, Méline. "Identification de signature causale pathologie par intégration de données multi-omiques." Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S071.
Systemic lupus erythematosus is an example of a complex, heterogeneous and multifactorial disease. The identification of signatures that can explain the cause of a disease remains an important challenge for the stratification of patients. Classical statistical analyses can hardly be applied when the populations of interest are heterogeneous, and they do not highlight causes. This thesis presents two methods that address these issues. First, a trans-omic model is described to structure all the omics data using Semantic Web standards (RDF). It is populated following a patient-centric approach. SPARQL queries interrogate this model and allow the identification of expression Individually-Consistent Trait Loci (eICTLs), reasoning-based associations between a SNP and a gene in which the presence of the SNP impacts the variation of the gene's expression. These elements reduce the dimension of the omics data and prove more informative than genomic data alone. This first method is omics data-driven. The second method is based on the regulatory dependencies existing in biological networks. By combining the dynamics of biological systems with formal concept analysis, the generated stable states are automatically classified. This classification enables the enrichment of the biological signatures that characterize a phenotype. Moreover, new hybrid phenotypes are identified.
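The eICTL idea, a SPARQL query relating the presence of a SNP to the expression of a nearby gene in a patient-centric RDF model, can be sketched with rdflib. The vocabulary (ex:hasSNP, ex:locatedNear, ...) and the data are invented for illustration; they are not the thesis's actual schema.

from rdflib import Graph

g = Graph().parse(format="turtle", data="""
@prefix ex: <http://example.org/> .
ex:patient1 ex:hasSNP ex:rs123 ;
            ex:expression [ ex:ofGene ex:IRF5 ; ex:level 8.1 ] .
ex:rs123 ex:locatedNear ex:IRF5 .
""")

query = """
PREFIX ex: <http://example.org/>
SELECT ?patient ?snp ?gene ?level WHERE {
  ?patient ex:hasSNP ?snp .
  ?snp ex:locatedNear ?gene .
  ?patient ex:expression [ ex:ofGene ?gene ; ex:level ?level ] .
}
"""

for row in g.query(query):
    print(row.patient, row.snp, row.gene, row.level)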
Gentilhomme, Théophile. "Intégration multi-échelles des données de réservoir et quantification des incertitudes." Thesis, Université de Lorraine, 2014. http://www.theses.fr/2014LORR0089/document.
In this work, we propose to follow a multi-scale approach for spatial reservoir property characterization using direct (well observations) and indirect (seismic and production history) data at different resolutions. Two decompositions are used to parameterize the problem: wavelets and Gaussian pyramids. Using these parameterizations, we show the advantages of the multi-scale approach on two minimization-based uncertainty quantification problems. The first one concerns the simulation of property fields from a multiple-point geostatistics algorithm. It is shown that the multi-scale approach based on Gaussian pyramids improves the quality of the output realizations, the match of the conditioning data and the computational time compared to the standard approach. The second problem concerns the preservation of the prior models during the assimilation of the production history. In order to re-parameterize the problem, we develop a new 3D grid-adaptive wavelet transform, which can be used on complex reservoir grids containing dead or zero-volume cells. An ensemble-based optimization method is integrated into the multi-scale history matching approach, so that an estimation of the uncertainty is obtained at the end of the optimization. This method is applied to several examples, where we observe that the final realizations better preserve the spatial distribution of the prior models and are less noisy than realizations updated using a standard approach, while matching the production data equally well.
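The wavelet parameterization can be pictured with PyWavelets: decompose a property profile, then adjust coarse coefficients before the finer ones. The synthetic profile and the coarse-only reconstruction below are illustrative assumptions, a 1D stand-in for the thesis's 3D grid-adaptive transform.

import numpy as np
import pywt

np.random.seed(0)
porosity = 0.2 + 0.05 * np.random.randn(64)      # synthetic 1D profile

coeffs = pywt.wavedec(porosity, "db2", level=3)  # [cA3, cD3, cD2, cD1]

# First step of a coarse-to-fine loop: keep the approximation scale,
# zero the detail scales, and reconstruct.
coarse = [coeffs[0]] + [np.zeros_like(c) for c in coeffs[1:]]
smooth = pywt.waverec(coarse, "db2")

print(porosity[:4].round(3), smooth[:4].round(3))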
Guillemot, Vincent. "Application de méthodes de classification supervisée et intégration de données hétérogènes pour des données transcriptomiques à haut-débit." PhD thesis, Université Paris Sud - Paris XI, 2010. http://tel.archives-ouvertes.fr/tel-00481822.
Full textAlili, Hiba. "Intégration de données basée sur la qualité pour l'enrichissement des sources de données locales dans le Service Lake." Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLED019.
In the Big Data era, companies are moving away from traditional data-warehouse solutions, whereby expensive and time-consuming ETL (Extract, Transform, Load) processes are used, towards data lakes in order to manage their increasingly growing data. Yet the knowledge stored in companies' databases, even in the constructed data lakes, can never be complete and up-to-date, because of the continuous production of data. Local data sources often need to be augmented and enriched with information coming from external data sources. Unfortunately, data enrichment is a manual labor undertaken by experts, who enrich data by adding information based on their expertise or select relevant data sources to complete missing information. Such work can be tedious, expensive and time-consuming, making it very promising for automation. We present in this work an active user-centric data integration approach to automatically enrich local data sources, in which the missing information is leveraged on the fly from web sources using data services. Accordingly, our approach enables users to query for information about concepts that are not defined in the data source schema. In doing so, we take into consideration a set of user preferences, such as the cost threshold and the response time necessary to compute the desired answers, while ensuring a good quality of the obtained results.
Vo-Van, Claudine. "Analyse de données pharmacocinétiques fragmentaires : intégration dans le développement de nouvelles molécules." Paris 5, 1994. http://www.theses.fr/1994PA05P044.
Full textImbert, Alyssa. "Intégration de données hétérogènes complexes à partir de tableaux de tailles déséquilibrées." Thesis, Toulouse 1, 2018. http://www.theses.fr/2018TOU10022/document.
The development of high-throughput sequencing technologies has led to a massive acquisition of high-dimensional and complex datasets. Several features make these datasets hard to analyze: high dimensionality, heterogeneity at the biological level or at the data-type level, noise (due to biological heterogeneity or to errors in the data) and the presence of missing data (for given values or for an entire individual). The integration of such varied data is thus an important challenge for computational biology. This thesis is part of a large clinical research project on obesity, DiOGenes, in which we have developed methods for data analysis and integration. The project is based on a dietary intervention conducted in eight European centers. This study investigated the effect of macronutrient composition on weight-loss maintenance and on metabolic and cardiovascular risk factors after a phase of calorie restriction in obese individuals. My work has mainly focused on transcriptomic (RNA-Seq) data analysis with missing individuals and on the integration of transcriptomic (new QuantSeq protocol) and clinical datasets. The first part focuses on missing data and network inference from RNA-Seq datasets. During a longitudinal study, some observations are missing at some time steps. In order to take advantage of external information measured simultaneously with the RNA-Seq data, we propose an imputation method, hot-deck multiple imputation (hd-MI), that improves the reliability of network inference. The second part deals with an integrative study of clinical data and transcriptomic data, measured by QuantSeq, based on a network approach. The new protocol is shown to be efficient for transcriptome measurement. We propose an analysis based on network inference that is linked to clinical variables of interest.
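Hot-deck imputation as described above can be sketched in a few lines: a sample whose RNA-Seq profile is missing borrows the profile of a donor close to it on auxiliary clinical variables, and repeating the random draw yields multiple imputations. The data, the Euclidean distance and the donor-pool size are toy assumptions, not the exact hd-MI procedure.

import numpy as np

rng = np.random.default_rng(1)
clinical = np.array([[30, 22.5], [31, 23.0], [55, 30.1], [29, 22.8]])
rnaseq = np.array([[5.0, 1.2], [5.1, 1.0], [9.8, 3.3], [np.nan, np.nan]])

def hot_deck(missing_idx, n_imputations=3, pool_size=2):
    dist = np.linalg.norm(clinical - clinical[missing_idx], axis=1)
    dist[missing_idx] = np.inf                   # cannot donate to itself
    donors = np.argsort(dist)[:pool_size]        # nearest complete samples
    return [rnaseq[rng.choice(donors)] for _ in range(n_imputations)]

for draw in hot_deck(3):
    print(draw)                                  # imputed profiles for sample 3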
Tran, Ba-Huy. "Une approche sémantique pour l’exploitation de données environnementales : application aux données d’un observatoire." Thesis, La Rochelle, 2017. http://www.theses.fr/2017LAROS025.
The need to collect long-term observations for research on environmental issues led to the establishment of "Zones Ateliers" by the CNRS. Thus, for several years, many databases of a spatio-temporal nature have been collected by different teams of researchers. To facilitate transversal analyses of different observations, it is desirable to cross-reference information from these data sources. Nevertheless, these sources are constructed independently of each other, which raises problems of data heterogeneity in the analysis. This thesis therefore proposes to study the potential of ontologies as objects of modeling, inference and interoperability. The aim is to provide experts in the field with a suitable method for exploiting heterogeneous data. Being applied in the environmental domain, the ontologies must take into account the spatio-temporal characteristics of these data. Given the need to model spatial and temporal concepts and operators, we rely on reusing existing ontologies of time and space. Then, a spatio-temporal data integration approach with a mechanism for reasoning on the relations in these data is introduced. Finally, data mining methods are adapted to spatio-temporal RDF data to discover new knowledge from the knowledge base. The approach was then applied within the Geminat prototype, which aims to help understand farming practices and their relationship with biodiversity in the "zone atelier Plaine et Val de Sèvre". From data integration to knowledge analysis, it provides the elements necessary to exploit heterogeneous spatio-temporal data as well as to discover new knowledge.
Degré, Yvon. "Intégration des données de la mécanique aux données de l'automatique : une contribution au poste de travail du génie automatique." Paris, ENSAM, 2001. http://www.theses.fr/2001ENAM0006.
Full textChulyadyo, Rajani. "Un nouvel horizon pour la recommandation : intégration de la dimension spatiale dans l'aide à la décision." Thesis, Nantes, 2016. http://www.theses.fr/2016NANT4012/document.
Nowadays it is very common to represent a system in terms of relationships between objects. One of the common applications of such relational data is the recommender system (RS), which usually deals with the relationships between users and items. Probabilistic Relational Models (PRMs) can be a good choice for modeling probabilistic dependencies between such objects. A growing trend in recommender systems is to add spatial dimensions to these objects and to make recommendations considering the location of users and/or items. This thesis deals with the little-explored intersection of three related fields: Probabilistic Relational Models (a method to learn probabilistic models from relational data), spatial data (often used in relational settings) and recommender systems (which deal with relational data). The first contribution of this thesis deals with the overlap between PRMs and recommender systems. We propose a PRM-based personalized recommender system that is capable of making recommendations from user queries in cold-start systems without user profiles. Our second contribution addresses the problem of integrating spatial information into a PRM.
Michel, Franck. "Intégrer des sources de données hétérogènes dans le Web de données." Thesis, Université Côte d'Azur (ComUE), 2017. http://www.theses.fr/2017AZUR4002/document.
To a great extent, the success of the Web of Data depends on the ability to reach legacy data locked in silos inaccessible from the web. In the last 15 years, various works have tackled the problem of exposing structured data in the Resource Description Framework (RDF). Meanwhile, the overwhelming success of NoSQL databases has made the database landscape more diverse than ever. NoSQL databases are strong potential contributors of valuable linked open data. Hence, the object of this thesis is to enable RDF-based data integration over heterogeneous data sources and, in particular, to harness NoSQL databases to populate the Web of Data. We propose a generic mapping language, xR2RML, to describe the mapping of heterogeneous data sources into an arbitrary RDF representation. xR2RML relies on and extends previous works on the translation of RDBs, CSV/TSV and XML into RDF. With such an xR2RML mapping, we propose either to materialize RDF data or to dynamically evaluate SPARQL queries on the native database. In the latter case, we follow a two-step approach. The first step translates a SPARQL query into a pivot abstract query, based on the xR2RML mapping of the target database to RDF. In the second step, the abstract query is translated into a concrete query, taking into account the specificities of the database query language. Great care is taken of query optimization opportunities, both at the abstract and the concrete levels. To demonstrate the effectiveness of our approach, we have developed a prototype implementation for MongoDB, the popular NoSQL document store. We have validated the method using a real-life use case in the Digital Humanities.
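The materialization mode can be pictured with a schematic mapping from MongoDB-style JSON documents to RDF triples. The real xR2RML language is itself expressed in RDF and is far richer; the dict-based mapping below is a deliberate simplification, and all names are invented.

from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/")

documents = [                       # what a MongoDB find() could return
    {"_id": "p1", "name": "Asterix", "born": 1959},
    {"_id": "p2", "name": "Obelix", "born": 1959},
]

mapping = {
    "subject": lambda d: URIRef(EX["person/" + d["_id"]]),
    "predicates": {"name": EX.name, "born": EX.birthYear},
}

g = Graph()
for doc in documents:
    s = mapping["subject"](doc)
    for field, predicate in mapping["predicates"].items():
        g.add((s, predicate, Literal(doc[field])))

print(g.serialize(format="turtle"))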
Deba, El Abbassia. "Transformation de modèles de données : intégration de la métamodélisation et de l'approche grammaticale." Toulouse 3, 2007. http://thesesups.ups-tlse.fr/220/.
Following recent discoveries about the several roles of non-coding RNAs (ncRNAs), there is now great interest in identifying these molecules. Numerous techniques have been developed to localize these RNAs in genomic sequences. We use here an approach which supposes the knowledge of a set of structural elements, called a signature, that discriminates an ncRNA family. In this work, we combine several pattern-matching techniques with the weighted constraint satisfaction problem framework. Together, they make it possible to model our biological problem, to describe the signatures accurately and to assign a cost to the solutions. We conceived filtering techniques as well as novel pattern-matching algorithms. Furthermore, we designed a software tool called DARN! that implements our approach, and another tool that automatically creates signatures. These tools make it possible to localize new ncRNAs efficiently.
Maisongrande, Philippe. "Modélisation du cycle du carbone dans la biosphère terrestre : intégration de données satellitaires." Toulouse, INPT, 1996. http://www.theses.fr/1996INPT131H.
Full textBlirando, Catherine. "Intégration du multimédia dans une architecture de bases de données à objets distribués." Versailles-St Quentin en Yvelines, 2000. http://www.theses.fr/2000VERS0022.
Full textLarnier, Hugo. "Intégration des données d'observatoires magnétiques dans l'interprétation de sondages magnétotelluriques : acqusition, traitement, interprétation." Thesis, Strasbourg, 2017. http://www.theses.fr/2017STRAH003/document.
In this manuscript, we detail the application of the continuous wavelet transform to processing schemes for the detection and characterisation of geomagnetic and atmospheric sources. The presented techniques are based on the time-frequency properties of electromagnetic (EM) waves observed in magnetotelluric (MT) time series. We detail the application of these detection procedures in an MT processing scheme. To recover MT response functions, we use robust statistics and a hierarchical bootstrap approach for uncertainty determination. The interpretation of two datasets is also presented. The first MT study deals with the characterisation of the resistivity distribution below the French national magnetic observatory of Chambon-la-Forêt. The second study details the interpretation of new MT soundings acquired in March 2016 in the Trisuli valley, Nepal. The main objective of this campaign was to compare the new soundings with an older campaign from 1996. We discuss topography effects on MT soundings and their implications for the resistivity distribution.
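The detection principle, using the time-frequency localization of the continuous wavelet transform to isolate transient EM events in a time series, can be sketched with PyWavelets on a synthetic signal. The burst, the scales and the amplitude threshold are all assumptions for illustration.

import numpy as np
import pywt

t = np.linspace(0, 10, 2000)                     # 200 Hz sampling (toy)
signal = 0.1 * np.random.randn(t.size)
signal[1000:1020] += np.sin(2 * np.pi * 40 * t[1000:1020])  # short burst

coefs, freqs = pywt.cwt(signal, np.arange(1, 64), "morl")

# Flag samples where small-scale wavelet energy stands out.
energy = np.abs(coefs[:8]).mean(axis=0)
events = np.where(energy > 3 * energy.mean())[0]
print(events)                                    # indices around the burst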
Mariette, Jérôme. "Apprentissage statistique pour l'intégration de données omiques." Thesis, Toulouse 3, 2017. http://www.theses.fr/2017TOU30276/document.
The development of high-throughput sequencing technologies has led to the production of high-dimensional heterogeneous datasets at different scales of living systems. To process such data, integrative methods have been shown to be relevant, but they remain challenging. This thesis gathers methodological contributions useful for the simultaneous exploration of heterogeneous multi-omics datasets. To tackle this problem, kernels and kernel methods represent a natural framework because they handle the specific nature of each dataset while permitting their combination. However, when the number of samples to process is large, kernel methods suffer from several drawbacks: their complexity increases and the interpretability of the model is lost. The first part of my work focuses on the adaptation of two exploratory kernel methods: kernel principal component analysis (K-PCA) and the kernel self-organizing map (K-SOM). The proposed adaptations first address the scaling of both K-SOM and K-PCA to omics datasets and second improve the interpretability of the models. In the second part, I was interested in multiple kernel learning to combine multiple omics datasets. The efficiency of the proposed methods is highlighted in the domain of microbial ecology: eight TARA Oceans datasets are integrated and analysed using a K-PCA.
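The multiple-kernel combination can be sketched with scikit-learn: one kernel per omics table, a weighted sum, then kernel PCA on the combined, precomputed kernel. The random data and the equal weights are toy assumptions (the thesis learns the combination).

import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
omics1 = rng.normal(size=(30, 100))    # e.g. gene abundances
omics2 = rng.normal(size=(30, 50))     # e.g. metagenomic profiles

K = 0.5 * rbf_kernel(omics1) + 0.5 * rbf_kernel(omics2)

embedding = KernelPCA(n_components=2, kernel="precomputed").fit_transform(K)
print(embedding.shape)                 # (30, 2) joint sample representation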
Dugré, Mathieu. "Conception et réalisation d'un entrepôt de données : intégration à un système existant et étape nécessaire vers le forage de données." Thèse, Université du Québec à Trois-Rivières, 2004. http://depot-e.uqtr.ca/4679/1/000108834.pdf.
Full textCoulibaly, Lacina. "Intégration de données multisources de télédétection et de données morphométriques pour la cartographie des formations meubles région de Cochabamba en Bolivie." Thèse, Université de Sherbrooke, 2001. http://savoirs.usherbrooke.ca/handle/11143/2717.
Full textHadj, Kacem Nabil. "Intégration des données sismiques pour une modélisation statique et dynamique plus réaliste des réservoirs." Paris, ENMP, 2009. http://www.theses.fr/2006ENMP1407.
Full textColonna, François-Marie. "Intégration de données hétérogènes et distribuées sur le web et applications à la biologie." Aix-Marseille 3, 2008. http://www.theses.fr/2008AIX30050.
Over the past twenty years, the volume of data generated by genomics and biology has grown exponentially. The interoperation of publicly available or copyrighted data sources is difficult due to the syntactic and semantic heterogeneity between them. Thus, integrating heterogeneous data is nowadays one of the most important fields of research in databases, especially in the biological domain, for example for predictive medicine purposes. The work presented in this thesis is organised around two classes of integration problems. The first part of our work deals with joining data sets across several data sources. This method is based on a description of source capabilities using feature logics. The second part of our work is a contribution to the development of a BGLAV mediation architecture based on semi-structured data, for effortless and flexible data integration using the XQuery language.
Kretz, Vincent. "Intégration de données de déplacements de fluides dans la caractérisation de milieux poreux hétérogènes." Paris 6, 2002. http://www.theses.fr/2002PA066200.
Full textHamidou, Salim. "Analyse de l'image en multirésolution : intégration de données multiéchelles dans un système d'information géographique." Paris 12, 1997. http://www.theses.fr/1997PA120029.
Full textMaranzana, Roland. "Intégration des fonctions de conception et de fabrication autour d'une base de données relationnelle." Valenciennes, 1988. https://ged.uphf.fr/nuxeo/site/esupversions/c2b1155d-a9f8-446e-b064-3b228bf8511b.
Full textBaril, Daniel. "Modélisation de l'érosion hydrique par intégration de données multisources à un système d'information géographique." Mémoire, Université de Sherbrooke, 1989. http://hdl.handle.net/11143/11135.
Full textMcHugh, Rosemarie. "Intégration de la structure matricielle dans les cubes spatiaux." Thesis, Université Laval, 2008. http://www.theses.ulaval.ca/2008/25758/25758.pdf.
Full textLagorce, Jean-Bernard. "Migration d'objet : intégration dans les bases de données à objets et solution pour l'évolution de schéma." Paris 11, 2002. http://www.theses.fr/2002PA112021.
This document studies the definition, properties and uses of object migration in object-oriented databases. The study focuses on three main subjects. The first subject is the study of the problems related to the object migration primitive itself. We mainly see these problems as ill-typed reference problems. We studied how these problems have to be solved and how to add this primitive to instance update languages. We then propose an instance update language featuring an object migration primitive. The second subject focuses on the way object migration solves the schema evolution problem. We show that it is necessary to reorganize the instance and that, in this situation, object migration offers a powerful primitive. The last subject is about the integration of object migration inside a database schema, allowing the definition of a dynamic schema. We define a mechanism consisting of migration constraints that allow specifying migration sequences in the database. We also show how it interacts with instance update languages by restricting the set of allowed programs upon the database. We have implemented two instance update languages featuring object migration. Even though object-oriented databases seem to be declining, we think that the dynamic problems studied in this thesis reach beyond a specific model, language or programming paradigm. We think that these problems will appear, for example, in XML.
Barriot, Roland. "Intégration des connaissances biologiques à l'échelle de la cellule." Bordeaux 1, 2005. http://www.theses.fr/2005BOR13100.
Full textKanellos, Léonidas. "Information juridique, intégration technologique et connaissance du droit dans l'Europe communautaire." Montpellier 1, 1990. http://www.theses.fr/1990MON10035.
The application of computerised information systems within the judicial and legal sector raises new questions and creates delicate problems. The author analyses the actual and potential impact of information technologies, "traditional" and modern, such as databases, telematics, office automation, electronic information retrieval, CD-ROM, expert systems and computer-assisted learning, on the accessibility of legal norms. This analysis of information technologies applied to the legal sector has to be accompanied by an analysis of the specific conditions of the elaboration of law itself at national and European levels.
Carpen-Amarie, Alexandra. "Utilisation de BlobSeer pour le stockage de données dans les Clouds: auto-adaptation, intégration, évaluation." PhD thesis, École normale supérieure de Cachan - ENS Cachan, 2011. http://tel.archives-ouvertes.fr/tel-00696012.
Full textBrisson, Laurent. "Intégration de connaissances expertes dans le processus de fouille de données pour l'extraction d'informations pertinentes." Phd thesis, Université Nice Sophia Antipolis, 2006. http://tel.archives-ouvertes.fr/tel-00211946.